By: Sheamus McGovern, Open Data Science Conference Chair
At ODSC East, the most influential minds and institutions in data science will convene at the Boston Convention & Exhibition Center from May 20th to the 22nd to discuss and teach the newest and most exciting developments in data science.
As you know, the Python ecosystem is now one of the most important data science development environments available today. This is due, in large part, to the existence of a rich suite of user-facing data analysis libraries.
Powerful Python machine learning libraries like Scikit-learn, XGBoost and others bring sophisticated predictive analytics to the masses. The NLTK and Gensim libraries enable deep analysis of textual information in Python, and the Topik library provides a high-level interface to these and other natural language libraries, adding a new layer of usability. The Pandas library has brought data analysis in Python to a new level by providing expressive data structures for quick and intuitive data manipulation and analysis.
The notebook ecosystem in Python has also flourished with the development of the Jupyter, Rodeo and Beaker notebooks. The notebook interface is an increasingly popular way for data scientists to perform complex analyses and to share those analyses and their results with colleagues and stakeholders. Python is also host to a number of rich web-development frameworks that are used not only for building data science dashboards, but also for full-scale, data science powered web-apps. Flask and Django lead the Python web-app development landscape, but Bottle and Pyramid are also quite popular.
With Cython, code can approach speeds akin to those of C or C++, and new developments, like the Dask package, make computing on larger-than-memory datasets very easy. Visualization libraries, like Plot.ly and Bokeh, have brought rich, interactive and impactful data visualization tools to the fingertips of data analysts everywhere.
Anaconda has streamlined the use of many of these wildly popular open source data science packages by providing an easy way to install, manage and use Python libraries. With Anaconda, users no longer need to worry about tedious incompatibilities and library management across their development environments.
Several of the most influential Python developers and data scientists will be talking and teaching at ODSC East. Indeed, Peter Wang will be speaking at ODSC East. Peter is the co-founder and CTO at Continuum Analytics, as well as the mastermind behind the popular Bokeh visualization library and the Blaze ecosystem, which simplifies the analysis of Big Data with Python and Anaconda. At ODSC East, there will be over 100 speakers, 20 workshops and 10 training sessions spanning seven conferences focused on Open Data Science, Disruptive Data Science, Big Data Science, Data Visualization, Data Science for Good, Open Data, and Careers and Training. See below for a very small sampling of the powerful Python workshops and speakers we will have at ODSC East.
● Bayesian Statistics Made Simple - Allen Downey, Think Python
● Intro to Scikit-learn for Machine Learning - Andreas Mueller, NYU Center for Data Science
● Parallelizing Data Science in Python with Dask - Matthew Rocklin, Continuum Analytics
● Interactive Viz of a Billion Points with Bokeh Datashader - Peter Wang, Continuum Analytics