There are many opportunities for Machine Learning in Software Engineering: testing, product management, processes… It might even set a sustainable pace!
This is the last post in a 3-post series about how we tried to apply machine learning to software engineering. The previous post was about how a 6-month internship in machine learning confirmed software engineering best practices. If you haven’t read the earlier posts yet, start from the beginning.
This was only a 6-month internship. We only got a glimpse of machine learning and data science for software engineering.
Data science for software testing is becoming a hot topic. I’ve noticed an increasing number of talks about data science at testing conferences.
The topics I’ve heard the most about are fuzzing with smart inputs and building an AI to analyze large volumes of test failures.
The Cucumber Pro team also seems to be looking into the same problem as Ismail did.
At the developer level, IDEs could embed an AI similar to Ismail’s. It could analyze every test run and identify the tests that are most likely to fail. Running these tests first could save a few seconds on every run. This could particularly improve continuous testing tools like NCrunch or Infinitest.
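To make the test-ordering idea concrete, here is a minimal sketch. It is not Ismail’s model: it swaps the machine learning for a simple heuristic that scores each test by its past failures and gives recent failures more weight. The function name `prioritize_tests`, the `decay` parameter and the shape of `history` are all assumptions made up for this illustration.

```python
from collections import defaultdict


def prioritize_tests(history, decay=0.8):
    """Order test names so the most failure-prone ones come first.

    history: list of runs, oldest first; each run maps a test name
             to True if the test passed (hypothetical data shape).
    decay:   factor applied to older results (assumed tuning knob).
    """
    score = defaultdict(float)
    for run in history:
        # Age the scores accumulated from earlier runs.
        for name in list(score):
            score[name] *= decay
        # Add 1 for every test that failed in this run.
        for name, passed in run.items():
            score[name] += 0.0 if passed else 1.0
    # Highest score first; ties are broken alphabetically.
    return sorted(score, key=lambda name: (-score[name], name))


history = [
    {"test_login": True, "test_cart": False, "test_search": True},
    {"test_login": False, "test_cart": False, "test_search": True},
    {"test_login": True, "test_cart": True, "test_search": True},
]
order = prioritize_tests(history)
print(order)  # ['test_cart', 'test_login', 'test_search']
```

A real tool would probably feed richer signals into the score, such as which files changed since the last run, but even this kind of ranking could be handed to a test runner as the execution order.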
💡 A smart AI could record what’s done in the IDE to suggest improvements to the way we work.
A new breed of tools that leverage data science is appearing. Here are a few I’ve heard about:
- Code Climate’s Velocity analyzes pull requests to identify process bottlenecks
- CodeScene analyzes Git history to pinpoint high-ROI refactorings
- LGTM analyzes quality and security violations to highlight the ones to fix now
💡 The LGTM team’s blog about data science and software engineering is pretty interesting!
Could data science help us learn best practices in product management? It might also support continuous improvement, similarly to what Velocity does by analyzing PRs. It seems a team at Content Bloom has been successfully applying a data-driven mindset to an Agile framework.
Data Science will improve software engineering in ways we cannot even think of today. What could we do if we combined data from all sources: ticketing-systems, VCSs, IDEs, static analyzers, CI servers, production…?
Proving ‘slow’ practices
Finally, a lot of the industry’s ‘best practices’ are about taking the time to do good work… to go faster in the end. Unfortunately, this is a hard message to sell when we are under the stress of fire-fighting bugs.
In 21 Lessons for the 21st Century, Yuval Noah Harari says that we will resort to AIs to make better decisions for us. With more knowledge and fewer biases, these decisions should be smarter. Daniel Kahneman supports a similar idea in Thinking, Fast and Slow.
This sounds frightening! But it could also bring a sustainable pace to a profession that too often lacks one.
Give it a try!
Believe it or not, getting started with Data Science and Machine Learning is not as difficult as it seems, especially for software engineers who already know how to manipulate data. Improving our processes is the perfect occasion to get started! 20 hours is all you need.
Unfortunately, there is also another, darker reason to get into the topic. As data science can increase productivity, it will become part of our daily work sooner or later. We’d better learn and own the topic before it becomes the Pointy-Haired Boss’s Big Brother dream…