Thursday, March 18, 2010

Converting C code to Python

This may sound like a weird thing to do, but I have wished for something like it every time I have had to read C code. Converting C code to Python can not only help us understand code more easily, but also make non-performance-critical code easier to maintain. C can be easy enough to read if it is well written and properly indented; however, having to move back and forth between .h and .c files drives me mad sometimes.

You can imagine my joy when I came across "ctopy", created by none other than Eric Raymond himself!

The first problem I ran into was that ctopy was not available on my Ubuntu system. No problem: a quick search through Eric's web site led me to it:

Naturally, ctopy is written in Python, and my immediate impulse was to download and test it.

For test code, I decided to download a small C program from the "Computer Language Shootout" site:

Unfortunately, even for this simple program, ctopy didn't do a good job. I am not including the results here, but you can easily replicate them with the command:

cat mandelbrot.c | indent | ./ctopy

The ctopy man page recommends piping the C code through indent, because ctopy relies on the indentation of the C code to do the translation.

Well, as Raymond himself says at the beginning of ctopy's man page, it's a quick and dirty translator that requires a human to finish the job.
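To give an idea of what "quick and dirty" can mean here, below is a minimal sketch of a line-by-line, regex-based rewriter. It is not ctopy's code: the rule list and the rough_translate function are made up for illustration, and they only cover a few easy cases. Like the approach described in the man page, it keeps the C indentation as the Python block structure, which is why well-indented input matters.

```python
import re

# Hypothetical rewrite rules, invented for this sketch (not ctopy's rule set).
RULES = [
    # /* comment */  ->  # comment
    (re.compile(r"/\*(.*?)\*/"), r"#\1"),
    # "int x = 0" -> "x = 0": drop the type on initialised declarations
    (re.compile(r"\b(?:int|long|double|float|char)\s+(\w+)\s*="), r"\1 ="),
    # "if (cond) {" / "while (cond)"  ->  "if cond:" / "while cond:"
    (re.compile(r"^(\s*)(if|while)\s*\((.*)\)\s*\{?\s*$"), r"\1\2 \3:"),
    # logical operators
    (re.compile(r"\s*\|\|\s*"), " or "),
    (re.compile(r"\s*&&\s*"), " and "),
    # trailing semicolon, optionally followed by an already-converted comment
    (re.compile(r";(\s*(?:#.*)?)$"), r"\1"),
]


def rough_translate(c_source):
    """Apply the rules line by line, keeping the original indentation."""
    out = []
    for line in c_source.splitlines():
        if line.strip() in ("{", "}"):
            continue  # braces on their own line carry no meaning in Python
        for pattern, replacement in RULES:
            line = pattern.sub(replacement, line)
        out.append(line.rstrip())
    return "\n".join(out)


c_code = "int x = 0;\nwhile (x < 10)\n{\n    x = x + 1; /* step */\n}\n"
print(rough_translate(c_code))
```

Anything beyond these patterns (for loops, pointers, structs, "} else {", the preprocessor) passes through untouched, which is exactly the part a human would still have to finish.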

Anyway, that's how far my dream of a "Google Translate" for code went. I decided to post the links in case someone decides to continue where Raymond stopped.


Anonymous said...

I once tried to make a tool to do this, but never quite solved the problem of globals in one module depending on globals of another, even for initialization. Then there are macros and other preprocessor statements that can modify code within a statement...

Eventually, you might have to make a choice: Do you want the code to execute correctly, or would you rather be able to read it?

Jabapyth said...

So I have done some serious work towards a "code translator" which started out as just a class project, but has gone a bit beyond that. The basic idea is to take some code, use a BNF grammar definition, and parse it into its underlying code structures. From that point it would be fairly simple to 'encode' this into a specific language.
So far it hasn't become a reality (I'll probably put the code on GitHub soon and start working on it again), but the possibility is definitely there.
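As an illustration of the parse-then-encode idea, here is a minimal sketch for a toy expression grammar. It is not the commenter's code: the grammar, the Parser class, and the two emitters (to_python and to_lisp) are assumptions made up for this example, and a real C grammar would be far larger.

```python
import re

# Toy grammar (hypothetical, for illustration only):
#   expr   ::= term (('+' | '-') term)*
#   term   ::= factor (('*' | '/') factor)*
#   factor ::= NAME | NUMBER | '(' expr ')'

TOKEN = re.compile(r"\s*(\w+|[-+*/()])")


def tokenize(text):
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if not match:
            raise SyntaxError(f"unexpected character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


class Parser:
    """Recursive-descent parser: one method per grammar rule."""

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self):
        token = self.peek()
        self.pos += 1
        return token

    def expr(self):
        node = self.term()
        while self.peek() in ("+", "-"):
            op = self.take()
            node = (op, node, self.term())
        return node

    def term(self):
        node = self.factor()
        while self.peek() in ("*", "/"):
            op = self.take()
            node = (op, node, self.factor())
        return node

    def factor(self):
        token = self.take()
        if token == "(":
            node = self.expr()
            if self.take() != ")":
                raise SyntaxError("expected ')'")
            return node
        if token is None or not re.fullmatch(r"\w+", token):
            raise SyntaxError(f"unexpected token {token!r}")
        return token


def to_python(node):
    """Encode the tree as a fully parenthesised infix expression."""
    if isinstance(node, str):
        return node
    op, left, right = node
    return f"({to_python(left)} {op} {to_python(right)})"


def to_lisp(node):
    """Encode the same tree in prefix notation."""
    if isinstance(node, str):
        return node
    op, left, right = node
    return f"({op} {to_lisp(left)} {to_lisp(right)})"


tree = Parser(tokenize("a + b * (c - 2)")).expr()
print(to_python(tree))
print(to_lisp(tree))
```

Once the tree exists, each target language is just another small function that walks it; the hard part, as the comment says, is the grammar.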

Flavio Coelho said...

@jabapyth: I was surprised that Raymond did not rely on a proper parser such as PLY for this job. I agree that it is doable. If you end up finishing it, please blog about it so that the community gets to know about it.

Gabi said...

I'd like to comment on your first paragraph, where you write: "Converting C code to Python can not only help us understand code more easily, but also make non-performance-critical code easier to maintain."

In one of my projects, we re-engineered an existing C application. In the process we not only reduced its length from more than 12 printed pages to roughly 4 and made it more readable and maintainable, but also got a decent understanding of the applied algorithm for the first time. Analysing this algorithm showed that the C program used an algorithm of exponential complexity, which we could replace with one of linear complexity. This resulted in a drop in run time from 1 min (as a C program) to 1 sec (in Python). Of course, I'm comparing apples to oranges, but you still get the idea...
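The comment does not say which algorithm was involved, so the following is only a stand-in: a sketch using Fibonacci numbers to show how an exponential-time formulation and a linear-time one can compute the same result. The function names are made up for this example.

```python
def fib_exponential(n):
    """Naive recursion: the number of calls grows exponentially with n."""
    if n < 2:
        return n
    return fib_exponential(n - 1) + fib_exponential(n - 2)


def fib_linear(n):
    """Iterative version: one loop pass per step, so linear in n."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


# Both agree on small inputs; only the linear one stays practical as n grows.
print(fib_exponential(20), fib_linear(20))
```

With a gap like this, the choice of algorithm can outweigh the speed difference between the two languages.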

Jabapyth said...

I've put the code up on GitHub:

Currently I have an example which can fully parse and prettify C code; from here it's not too big a step to outputting Python. Most of the work is in crafting an effective and efficient BNF grammar.

Flavio Coelho said...

@Jabapyth: Good job! I'll be glad to blog about "codetalker" once it gets more mature.

Davide said...

I've just discovered the c-to-python command in the Leo programmer's editor. Documentation is scarce, there is a learning curve, and of course the result is not usable out of the box. But it seems better than the other options mentioned in this post, so here is the link