Claudia Solís-Lemus, a statistician by practice and an assistant professor in the Department of Plant Pathology at the University of Wisconsin-Madison, develops software with her research group when she is not busy swimming across Devils Lake in Wisconsin. All of that software is open source, in the hope that it will benefit the scientific community.
One of her most prominent open-source projects is a Julia package that she created during her Ph.D. at UW-Madison. The fully open-source tool allows evolutionary biologists to better reconstruct the tree of life from genetic sequences.
“I think it’s one of the most impactful ones because I have met many evolutionary biologists that have used it on their data,” Solís-Lemus said. “This makes me very happy that it is a product that is being used by the community.”
When creating these projects, she finds it imperative to have adequate documentation to guide users through the repository. A ‘readme file,’ as Solís-Lemus called it, serves as that guide, showing users what the different parts of the project are, where to find them and more.
“The files that you have and the content of your folder is not so much for other people, which they profit from, but even yourself a year from now. You’re not going to remember what you did in this code,” Solís-Lemus said.
In open source, a repository that gives others full access and the ability to collaborate is extremely helpful for ongoing research. Solís-Lemus wants her software not only to be available but to be available in a useful form, so that others can understand how to apply it in their own research.
“One of the challenges is that not everybody’s on the same page on the importance of that,” Solís-Lemus said. “So just trying to explain to people why it’s important to have these standards for code.”
Solís-Lemus emphasizes this documentation with the students and postdocs who work with her. Through training tutorials on best computing practices, ways to document code, folder structure and more, she helps her students become better open-source practitioners who can contribute more high-quality projects to the open-source community.
Maintaining a project is one of the main challenges in open source, and for Solís-Lemus it becomes especially hard when her students graduate. Before they go, she reviews the repositories with them to make sure everything is in order and that she knows where everything is. Funding can make it difficult to hire someone to maintain these repositories, but at the least, she makes sure her students have crossed all their T’s and dotted their I’s, so that when they graduate they have, as she described it, a well-documented project to show.
Beyond keeping repositories well documented so that others can use and maintain them, open-source projects have the advantage that more people can look at and edit the code, according to Solís-Lemus.
“I feel like as one individual you have limitations. You, of course, do your best to put the best product forward, but it’s always better, like we do that with peer review for publications,” Solís-Lemus said. “You have other people read and evaluate your work to make it up to standards for the scientific community.”
As a developer, it is important to make code understandable for other users. Because the programs are open source, and can be reviewed by many people in the open-source community, they have an advantage here, especially since people do not want to spend a lot of time looking through unorganized repositories. Whether others improve the code or criticize it, that feedback can help make the code and programs better than before.
Learning about open source before getting started can help programmers develop stronger open-source code. Understanding how open-source development works beforehand can make things go more smoothly, but it is still a learning process, and no one will be perfect on the first day.