Quantcast
Channel: Planet Python
Viewing all articles
Browse latest Browse all 22462

S. Lott: A Reason for Avoiding Programming

$
0
0
From someone in the process of becoming a data scientist. They had a question on regular expressions, which made almost no sense. It appears that the core concepts of ETL -- Extracting source data, Transforming it into a useful form and the Loading into some persistent storage for long-term analysis -- had not been embraced. It appears the design pattern was unknown. All I could gather from the sketchy email chain was that something involving regular expressions had become difficult.

I wrote this in response: Handling Irregular File Formats.

Here's part of the follow-up.

"I have been focusing on the math associated w/ math optimization. I have been using spreadsheets to perform the computations."

Really.

Spreadsheets.

The ETL pipeline question/rant/complaint was part of loading a spreadsheet?

That seems somehow wrong. There are real tools available that really do real data science work. The word "optimization" hints that scipy.optimize might be a more useful exercise than hacking around with spreadsheets.

Perhaps some advice from a real data scientist might help: http://www.becomingadatascientist.com


Viewing all articles
Browse latest Browse all 22462

Trending Articles



<script src="https://jsc.adskeeper.com/r/s/rssing.com.1596347.js" async> </script>