When “sciencing” over big data sets, especially during exploratory phases with the business units, I find that I’ll often work in a mix of REPL and scratch files (Python), for screencasting and/or someone-over-my-shoulder type situations… These inevitably end up as heaps of barely stable code spread across endlessly chopped-up files…

It’s the nature of the beast, and these kinds of explorations aren’t worth turning into full-fledged projects that I’d put in a git repo, but I’m starting to get aggravated by the slapdashness of it. By far the most aggravating thing is the matplotlib-generated graphs that get shared on our internal message boards. These graphs get shared specifically because they capture insights very well, but I find it pretty much impossible to make them “carry their baggage” with them: the code and the underlying data…

I tried working with Jupyter notebooks for a while, but couldn’t quite get a feel for them for this type of exploratory work…

What are your workflows around this type of work? How do you preserve even the slightest amount of traceability in your explorations?
