Conf42 DevOps 2025 - Online


Advanced Test Harness Infrastructure for Validating ARM and FPGA-based Systems


Abstract

Delivering quality in the fast-evolving landscape of software and hardware integration requires automated, rigorous, and continuous testing - this is where our innovative test harness comes in. Let’s learn how to enhance QA and streamline testing by using both general and custom-built technologies.


Transcript

This transcript was autogenerated. To make changes, submit a PR.
Hello, everyone, and thank you for joining me today at COM 42 DevOps conference. My name is Stefan and I work as DevOps engineer at Analog Devices. I'm excited to present you today an automated way of testing in hardware across a wide range of platforms that run different Linux distributions. System level testing plays an essential role for quality assurance in the fast evolving landscape of software and hardware integration. A robust testing infrastructure is required to ensure that individual software components work together. Key attributes like automation, rigorous testing, and coverage are some of the must haves for achieving a reliable integration. By implementing these best practices, Organizations can deliver high quality systems that meet user expectations. A reliable test infrastructure accelerates also test executions, but most important, reduce the probability of finding bugs in production phase. The Linux distribution I talked about is called Analog Devices Kuiper and it is a free open source distro customized for analog devices signal chain. It comes pre equipped with essential components like device drivers, pre built boot files or FPGA, and Raspberry Pi based solutions, as well as a wide range of development utilities, libraries, and project examples. The system supports multiple hardware platforms, AMD and Intel FPGA based platforms, Raspberry Pi, and also NXP. To streamline software management, KyperLinux incorporates a custom Linux package repository. Simplifying software components installation or update. This is included by default in the image ensuring easy of use for both testing and production environments. For more details regarding how Kyper Linux releases got optimized, please check Refining the release strategy of a custom Linux distro that is also presented at this edition of Comfortitude DevOps by one of my colleagues, Andrea. The continuous integration flow for individual software components is handled using classic CI tools, such as Jenkins, ErrorPipeline, or GitHub Actions. The CI builds software components across various operating systems and the resulting output binaries, which might be Linux packages, Windows installers, or just archives, are stored in GitHub releases, package manager, or internal server. This streamlined approach not only improves efficiency, but also ensures traceability, version control, and artifacts accessibility, making it easier to integrate and deploy them into larger systems. The testing process in hardware closely mirrors the one in software. The workflow typically begins with specific boot files being written onto the hardware boards to configure them for testing. Tests are executed in parallel across multiple hardware setups. Reducing overall testing time and increasing efficiency. Because popular testing framework available in the market even are designed for application level testing or are specific to a single hardware platform we worked on creating our own testing framework called Hardware Test Harness. It is designed to unify testing across a wide range of hardware platforms enabling consistent execution regardless of the hardware type. Builds are triggered by pushes or pull requests in GitHub. Multiple repositories are monitored like HDL, Linux, libraries or applications. And changes from those will trigger the entire testing process. Let's see what happens after the build passes. The results binaries are saved on internal servers, in this case JFrog Artifactory. And then the main Jenkins job is triggered. 
The Jenkins job begins by downloading the binaries from Artifactory and writing them onto the hardware platforms connected to each agent. After this, the automated tests are distributed and executed. The tests are mostly written in Python and have specific tags that indicate hardware compatibility. Let's now see why we need an intermediate Artifactory server. This server is required for several reasons. The main one is that the hardware test harness is distributed across multiple physical locations, requiring a centralized system to share binaries. Another reason is to handle concurrency: since multiple repositories are involved, it acts as a buffer. It also ensures better organization and versioning of binaries, making the process more scalable. Another question may be why the Jenkins agents need to be physical machines. The primary reason is that these agents directly control the hardware setups, including power management, UART connections, Ethernet, and in some cases a JTAG interface. Some of the build servers are also physical machines, because builds require significant computational resources or paid licenses, as in the Xilinx or MATLAB cases.

A good framework should be adaptable to different types of hardware and testing scenarios. It should be modular enough to accommodate changes, such as adding new DUTs or modifying test cases. Because multiple repositories and multiple Jenkins agents are involved, a test manager is necessary to queue, distribute, and execute tests. But let's first go through some implementation details. The main tool used is Jenkins. It can be hosted independently without relying on external servers, integrates easily with tools like GitHub or Artifactory, and benefits from a large online community for support. Additionally, features such as Jenkins shared libraries, a dynamic scripting language, and the resource locking mechanism proved to be extremely useful in this context. Another tool used is Nebula. This tool was developed by us and consists of a collection of Python scripts that manage hardware connections, such as sending UART commands, configuring Ethernet IPs, sending files through SSH, and so on. In the event of a hardware setup failure, we can physically reboot the system and bring it back online using a Power Distribution Unit and USB SD card muxes. Both of them are also controlled through Python by Nebula. As we added more hardware setups, we needed a tool to keep track of all of the devices under test, and that's where NetBox comes in. It is a free, open-source tool originally designed for modeling and documenting network racks, but it also fits our use case. We use it to generate Nebula configuration files: YAMLs that contain information about each DUT, such as the platform, the board that is plugged into it, Ethernet and serial addresses, PDU outlet, and USB connections. The NetBox configuration needs to be updated only when DUTs are added, removed, or rearranged. All data stored in NetBox is backed up automatically in Artifactory and can be restored if something goes wrong. The Jenkins shared library is a very good way to centralize Groovy code and reuse it in multiple Jenkins pipelines. It contains definitions for common functions and pipeline steps that can be shared across different Jenkinsfiles, improving reusability, modularity, and consistency. We use it in multiple pipeline stages to update agent tools such as Nebula, to send files to hardware setups, or to run tests and collect results.
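To illustrate the tagging and per-DUT configuration ideas, here is a minimal pytest sketch: a conftest.py reads a DUT description, in the spirit of the NetBox-generated Nebula YAMLs, and skips tests whose hardware marker does not match the attached board. The marker name, YAML fields, and board identifiers are illustrative assumptions, not the harness's actual schema.

```python
# conftest.py -- illustrative sketch only, not the actual Hardware Test Harness code.
import pytest
import yaml  # PyYAML


def pytest_configure(config):
    # Register the custom marker so pytest does not warn about it.
    config.addinivalue_line("markers", "hardware(board): boards this test supports")


# Hypothetical per-DUT description, e.g. generated from NetBox:
#   platform: zynqmp
#   board: adrv9009-zcu102
#   uart: /dev/ttyUSB0
#   ip: 192.168.10.42
#   pdu_outlet: 3
with open("dut.yaml") as f:
    DUT = yaml.safe_load(f)


def pytest_runtest_setup(item):
    """Skip any test whose @pytest.mark.hardware(...) tags do not match this DUT."""
    supported = {m.args[0] for m in item.iter_markers(name="hardware")}
    if supported and DUT["board"] not in supported:
        pytest.skip(f"requires one of {sorted(supported)}, connected DUT is {DUT['board']}")


# test_dma.py -- a test tagged with the boards it is compatible with.
@pytest.mark.hardware("adrv9009-zcu102")
def test_rx_dma_capture():
    ...  # talk to the board over the IP/UART taken from the DUT config
```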
This structured approach ensures an efficient process of updating and maintaining the same pipeline functionality across all the test harness instances. By combining continuous integration with continuous testing, the resulting diagram looks like this. Behind it are over 100 CI pipelines, implemented in Azure Pipelines, GitHub Actions, or Jenkins, and about 15 physical build servers. For most of them, besides the build status, the test results from hardware testing are returned to the GitHub pull request. Some software components, such as libraries, are tested individually on hardware. If all of the tests pass, the corresponding binaries are stored in Artifactory or in the Linux package repository. For other components, the build artifacts are first stored on internal servers and tested afterwards. In some cases, Linux packages are created automatically at each push and saved into the package repository. Ideally, once all the software components are packed as Linux packages, the packages generated by each CI run will be uploaded automatically to a testing environment, so they can be installed on Kuiper for further testing or simply used internally as pre-release versions. On the other side, whenever there are changes in the Kuiper sources, new Docker containers are created and used by other CIs. This ensures that everything is consistently built and validated across multiple environments.

Let's see how results visualization was handled. The easiest way was to manually verify the status of the Jenkins pipelines. Of course, this method didn't give us any details about which stages failed, and on which hardware setup. So we switched to the Blue Ocean view, which looks a bit better. For those of you who don't know, Blue Ocean is a Jenkins plugin that offers a good visualization of parallel stages. In this case, we could see the status of all the stages on all the hardware setups, making it a bit easier to know exactly what failed. But I still couldn't see all the details, so the next step was to convert the results into XML format and use the JUnit Jenkins plugin for visualization. Even this method required logging into Jenkins and visually inspecting the results at every run. So we started to use even more powerful tools: Logstash for processing results, Elasticsearch for storing them in a database, and Kibana for generating graphs. At this point, we also created a web page with multiple dashboards and added the ability to create and apply filters. The new implementation eliminates the need to go through individual artifacts to check results, but developers still needed to look over the graphs to check the status before pushing their changes. Actually, in all of the above cases, developers still needed to manually check the results. Even with the results shown in tables, graphs, or dashboards, it was not feasible to handle a large number of repositories and pull requests this way, so we needed to close the loop completely.
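As a rough sketch of the results pipeline just described, the snippet below parses a JUnit XML report and indexes one document per test case into Elasticsearch, the kind of data the Kibana dashboards can then be built on. In the actual setup this processing is handled by Logstash; the index name, document fields, and URL are assumptions made purely for illustration.

```python
# Illustrative sketch: turn a JUnit XML report into Elasticsearch documents.
# In the setup described in the talk this processing is done by Logstash;
# the index name, fields, and URL below are assumptions for the example only.
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

import requests

ES_URL = "https://elasticsearch.example.com:9200"


def index_junit_report(xml_path: str, build_url: str, board: str) -> None:
    root = ET.parse(xml_path).getroot()
    # A JUnit report may have a <testsuites> wrapper or a single <testsuite> root;
    # iter() handles both because it also yields the root element when it matches.
    for suite in root.iter("testsuite"):
        for case in suite.iter("testcase"):
            if case.find("failure") is not None or case.find("error") is not None:
                status = "failed"
            elif case.find("skipped") is not None:
                status = "skipped"
            else:
                status = "passed"
            doc = {
                "@timestamp": datetime.now(timezone.utc).isoformat(),
                "suite": suite.get("name"),
                "test": case.get("name"),
                "time_s": float(case.get("time", 0)),
                "status": status,
                "board": board,
                "build_url": build_url,
            }
            requests.post(f"{ES_URL}/hw-test-results/_doc", json=doc).raise_for_status()
```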
An important step in the implementation was to bind hardware test results back to GitHub pull requests. The main challenge here was to ensure that private data from our internal build and test environment, such as internal IPs, Jenkins links, or any other sensitive information, is not exposed in public repositories. At the same time, it was very important to provide sufficient information about the build status and testing results to aid developers. To achieve this, multiple tools were used to parse the results, merge them into the same tables as the build statuses, securely tunnel SSH connections, and post summaries via Gist. With this system in place, we were finally able to enable the "require CI status checks to pass" setting on the GitHub repositories. This ensures that only changes that pass the builds and don't break any test are allowed to be merged, increasing the overall stability and reliability of the code base. You can dive deeper into this topic in "Secure integration of private testing infrastructure with public GitHub repositories", presented by my colleague Bianca at the same edition of Conf42 DevOps.

The final step needed to achieve a fully automated testing framework was to implement a mechanism for recovering hardware setups from bad states. One common issue arises when boot files produced by the CIs are faulty. In this case, hardware setups can hang during the boot process and remain stuck in an unstable state. The framework detects these failures and attempts to recover the affected boards through various methods. As part of this process, we maintain a set of golden files: a reliable baseline of boot files that is overwritten with the latest set of files that passes testing successfully. They serve as a fallback option, allowing us to get the hardware systems back up and running and, of course, ready for the next set of files to be tested. However, the rare scenario in which a hardware setup is physically damaged remains the only situation that requires manual intervention. This recovery mechanism ensures that the testing framework remains resilient, minimizes downtime, and increases the efficiency of testing.

Now that I have gone through all the details, let's see what the overall design looks like. On the left side, you can see the triggering mechanism, multiple Jenkinsfiles, and the Jenkins server. The server manages the testing request queue, ensuring efficient resource allocation, and it also merges the results from all test harness instances and prepares them to be published. Then there are the Jenkins agents. By deploying agents inside Docker containers, we have successfully connected multiple hardware setups to a single physical machine, optimizing resource usage. The Test Harness supports tests written in different programming languages, such as Python, C, or MATLAB. Hardware boards are locked only while tests are running on them; otherwise they remain accessible for remote connections, allowing team members to perform debugging and development work. Test results are well structured and presented clearly, ensuring that any defects are identified and addressed in early stages. This system increases efficiency, maintains stability, and enhances the overall reliability of the testing process. But let's see how the hardware setup looks in real life. This is how the prototype looked in the early stages: there were just a few hardware boards connected to each other, lying around on a desk. At that time, we were experimenting with using a Raspberry Pi as a Jenkins agent and adding support for multiple platforms. And this is how the Test Harness looks now.
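Coming back to how results are bound to pull requests, a minimal sketch of that idea, assuming the GitHub REST API is called directly, could look like this: the summary is published as a secret Gist and a commit status is attached to the pull request's head SHA. The repository name, status context, and token handling are illustrative; the actual tooling described in the talk also involves SSH tunnelling and merging of build statuses.

```python
# Minimal sketch of reporting back to GitHub: publish the result summary as a
# secret Gist and attach a commit status to the pull request's head SHA.
# Repository name, status context, and token handling are illustrative only.
import os
import requests

API = "https://api.github.com"
HEADERS = {"Authorization": f"token {os.environ['GITHUB_TOKEN']}",
           "Accept": "application/vnd.github+json"}


def publish_summary(summary_md: str) -> str:
    """Upload the markdown summary as a secret Gist and return its URL."""
    resp = requests.post(f"{API}/gists", headers=HEADERS, json={
        "description": "Hardware test harness results",
        "public": False,
        "files": {"summary.md": {"content": summary_md}},
    })
    resp.raise_for_status()
    return resp.json()["html_url"]


def set_commit_status(repo: str, sha: str, passed: bool, target_url: str) -> None:
    """Create the commit status that a 'required status checks' branch rule enforces."""
    resp = requests.post(f"{API}/repos/{repo}/statuses/{sha}", headers=HEADERS, json={
        "state": "success" if passed else "failure",
        "target_url": target_url,
        "description": "Hardware tests passed" if passed else "Hardware tests failed",
        "context": "hw-test-harness",
    })
    resp.raise_for_status()
```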
In conclusion, we have managed to implement a very complex testing framework that can be triggered from multiple GitHub repositories, as well as from Jenkins, cron, or even manually. Hardware setups remain accessible for remote connections, allowing team members to perform debugging and development. It supports multiple platforms and can run tests written in different languages. Resource usage was optimized by running Jenkins agents inside Docker containers, and there is a robust recovery mechanism in place. Test results are well structured and bound back to GitHub, ensuring that bugs are found as early as possible. The presented testing framework is highly efficient, flexible, and robust, and designed for complex workflows. It ensures software and hardware integration, streamlines the testing process, and enhances stability. Thank you all for listening to me. If you have any related questions or need more details, don't hesitate to contact me. Have a nice day. Bye.
...

Stefan Raus

DevOps Engineer @ Analog Devices


