Python software metrics: my first useful OS project?
I’ve tried to open-source code quite a few times, but the projects have been niche enough that they haven’t been very useful. Well, I finally have something universally useful.
I’ve taken an interest in code metrics recently (as documented on this blog) and I have been quite upset to learn that there are few good tools for measuring them in Python code. PyLint and PyChecker and the like are not what I’m talking about; I want dependency graphs, measures of cyclomatic complexity, automatic coverage analysis, etc.
So basically what I’m doing is creating a framework that wraps a bunch of existing functionality into an easy-to-use system, and expands or refactors it where necessary. My goal is to make it a ‘drop in’ system so it will be trivial to get thorough code metrics for your codebase (similar to how simple it is to do in Visual Studio).
Right now I’ve created a SLOC (Source Lines of Code) generator, a wrapper for nose and coverage, and hooked it up to pygenie to measure cyclomatic complexity. Unfortunately pygenie is going to need a significant refactoring, so I won’t be able to fork it directly. I’ll be hooking it up into our automated test framework at work this week as well for some battle testing. I’m 100% sure there’s a good deal of extensibility and configuration adjustments I’ll need to make to support alternative setups. Next up will be automatic generation of dependency graphs (which doesn’t look easy at all, unfortunately). And writing tests (this is the first project in a while that I didn’t sort-of-TDD). Oh, and getting it into Google Code.
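To give a feel for the kind of numbers I mean, here is a minimal sketch of a SLOC counter and a rough cyclomatic complexity estimate using only the standard library. This is not the pynocle or pygenie code; the function names (count_sloc, cyclomatic_complexity, complexity_by_function) are made up for illustration, and the counting rules are simplifying assumptions (docstrings count as code, nested functions are also counted in their parent).

```python
import ast
import io
import tokenize

# Token types that do not make a line "code" on their own.
_NON_CODE = {
    tokenize.COMMENT, tokenize.NL, tokenize.NEWLINE,
    tokenize.INDENT, tokenize.DEDENT, tokenize.ENDMARKER,
}

# Node types that add one decision point each (an assumed, simplified rule set).
_BRANCH_NODES = (
    ast.If, ast.For, ast.While, ast.ExceptHandler,
    ast.IfExp, ast.comprehension,
)


def count_sloc(source):
    """Count lines holding at least one code token.

    Blank lines and comment-only lines are skipped; docstrings are
    string tokens, so they are counted as code here.
    """
    code_lines = set()
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type not in _NON_CODE:
            # A multi-line string spans several physical lines.
            code_lines.update(range(tok.start[0], tok.end[0] + 1))
    return len(code_lines)


def cyclomatic_complexity(func_node):
    """Approximate complexity: 1 + number of decision points."""
    score = 1
    for node in ast.walk(func_node):
        if isinstance(node, _BRANCH_NODES):
            score += 1
        elif isinstance(node, ast.BoolOp):
            # 'a and b and c' adds two extra paths.
            score += len(node.values) - 1
    return score


def complexity_by_function(source):
    """Map each function name in the source to its approximate complexity."""
    tree = ast.parse(source)
    return {
        node.name: cyclomatic_complexity(node)
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    }
```

Feed either function the text of a module (for example, open(path).read()) and you get a line count or a name-to-score dictionary back. A real tool would need to decide how to treat docstrings, nested functions, and methods with the same name in different classes, which this sketch glosses over.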
Is this something you guys can see hooking into your codebases? Do you see the value of and want to find out metrics of your codebases?
Oh and it’s tentatively called ‘pynocle’, if you have a better name I’d love to hear it.
“Do you see the value of and want to find out metrics of your codebases?”
Emphatically yes. I’ve been working on some code that lexes and parses MAXScript and C# to intermediate representations. I’d really like to include features such as cyclomatic complexity, dependencies, code completion, even “paradigm shifts” (functional programming being refactored to OOP (and back and forth)). I’m writing it because I couldn’t find any decent open source (or closed source) project that did what I wanted. With the addition of language translation, I can see a program that could potentially lex and parse any language to an intermediate representation, do various types of code metrics on that IR, and then output that IR to another language (taking into account language-specific commands and notifying when a translation has errors or fails).
Based on your experience, do you think a program such as this would be useful to programmers?
Grak, have you looked at something like PyPy regarding intermediate languages/systems? I think what you’re talking about is very interesting but incredibly complex, if it can even be done well (there are simply some things that cannot be successfully represented in certain languages, like lambdas). I think it’d be very difficult to do code analysis on some intermediate code because the idioms are too different. It’d be like operating on IL, instead of C#, which is not fair (the IL can be many times more complex than the C# program used to create it).
I think it’d be very useful if it were possible but I don’t think it’s possible :(
Rob, good points. Perhaps it’s possible to build the IL from a flow control graph representing the original idioms? When I have something to share, I’ll be sure to send you a link. :)
Sounds good Grak, even if you decide it isn’t possible/worth it, I’d like to know what you come up with.