Show HN: Hands-On Cloud Troubleshooting Labs – AWS, GCP, Azure, and Kubernetes

by lumaks on 6/18/2024, 4:24 PM with 7 comments

Hello HN,

I’m excited to share something we’ve been working on: Cloud Troubleshooting Labs. This tool provides on-demand environments for engineers to practice hands-on troubleshooting with AWS, GCP, Azure, and Kubernetes.

What it does: We use Pulumi to spin up these environments. Each scenario includes a diagram and an explanation of the available IT infrastructure. Engineers are tasked with fixing configuration issues to make the application work. Depending on the test type, the environment ranges from a simple terminal to a full cloud environment or a web-based VSCode. Many of our scenarios generate a random set of problems each time the test is started. We originally designed this to prevent cheating in company assessments. Now that the labs are open to individual engineers, you can redo the same test and face different problems each time.
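A minimal sketch of how per-run randomization like this could work. It is an illustration only, not Brokee's actual implementation: the fault names, `FAULT_CATALOG`, and `pick_faults` are all hypothetical.

```python
import random

# Hypothetical fault catalog; the names are illustrative, not real Brokee scenarios.
FAULT_CATALOG = [
    "wrong_security_group_port",
    "missing_iam_permission",
    "bad_dns_record",
    "crashlooping_pod",
    "full_disk",
    "misconfigured_service_selector",
]


def pick_faults(count, seed=None):
    """Return `count` distinct faults chosen at random from the catalog."""
    if not 0 <= count <= len(FAULT_CATALOG):
        raise ValueError("count must be between 0 and the catalog size")
    # A fixed seed reproduces the same draw; seed=None gives a fresh draw per run.
    rng = random.Random(seed)
    return rng.sample(FAULT_CATALOG, count)
```

Calling `pick_faults(3)` on each test start would give a different combination most of the time, while passing a seed makes a specific run reproducible for debugging.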

Why we built this: Initially, I created these labs to assess the hands-on skills of engineers we were hiring for senior cloud engineer/SRE roles. Selling to companies was tough; the best conversations we had were with engineers. So, we decided to release personal plans where engineers can practice troubleshooting in realistic environments.

Technical details: Environments are provisioned using Pulumi. We use a microservice architecture, largely for historical reasons: the first skills assessment was designed for Kubernetes. The user experience is inspired by Katacoda. Our last load test spun up 100 environments in parallel, which took around 10 minutes. For this launch, we’ve increased the number of replicas for core services to handle potential spikes in users. We expect some things to break if there is enough interest, but that will be a good sign for us. If you find any security bugs, please report them to info@brokee.io.
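For readers curious what spinning up many environments in parallel can look like, here is a rough sketch using only the Python standard library. It makes several assumptions: `provision_environment` is a hypothetical stand-in that only sleeps (a real version would trigger a Pulumi run), and the worker count is arbitrary. It is not how the platform itself is built.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def provision_environment(env_id):
    """Hypothetical stand-in for real provisioning; here it only sleeps briefly."""
    time.sleep(0.01)
    return f"env-{env_id}"


def provision_many(count, max_workers=20):
    """Provision `count` environments concurrently; results keep input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the order of the inputs, not completion order.
        return list(pool.map(provision_environment, range(count)))
```

Threads fit here because provisioning is mostly waiting on remote APIs. With `max_workers` below the batch size, the remaining jobs queue until a worker is free.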

Relatable experience: The main pain point I wanted to solve was working with engineers who, even after months of training, struggled with new problems. I wanted to see if people could jump into a broken environment and figure out what’s wrong. A fun story: one student couldn’t sign up as he didn’t get an invite to an early version of the website, so he hacked us a bit. I’ll leave the details out for now, but I’d want to hire him for his persistence!

Future plans: In the near future, we plan to add an option to select problem areas you’d like to practice, such as Linux networking or working with the filesystem. This will generate problems related to those areas, providing a more guided experience. However, we don’t want to be like other platforms where you are told what to do and just repeat the steps.

Engineers can sign up here and try 5 tests for free (except full cloud environments, which aren't available on the free tier): https://brokee.io/devopsskillstests. There is a similar offer for companies on the main page if someone is interested in trying the product with a team.

We’re eager for feedback, so please give it a try and share your thoughts!

by diamonce on 6/26/2024, 10:36 AM

I love the interview gamification aspect of this. So much fun actually. Great product!

by matelang on 6/18/2024, 6:23 PM

By far the most enjoyable interview experience I had. Hands-on and efficient. Kudos to Maksym and the brokee team!

by borgzl on 6/18/2024, 9:43 PM

Cool. Do you allow some kind of certification too?