
Reuven Lerner: Weekly Python Exercise: Registration closes in two days


This is just a reminder that registration for the next cohort of Weekly Python Exercise, my course that combines exercises and community to turn you into an advanced Python developer, closes in just two days, on September 18th.

If you’ve always wanted to improve your understanding of such topics as functions, objects, decorators, generators, comprehensions, and lambda (among other things), then WPE is for you!  I only open 1-2 cohorts per year, so if you want to level up your Python — and stop relying on Stack Overflow and Google to answer your questions — be sure to check it out.

With this cohort, I’m adding tests with PyTest to most exercise specifications! This means that you’ll not only get better at coding, but at testing, too.

You can read more at http://WeeklyPythonExercise.com/.

Remember: I offer discounts to students, pensioners, and residents of countries that aren’t among the world’s 30 richest.  Just e-mail me at reuven@lerner.co.il for a coupon code.

And: Once you buy WPE, my “forever free” policy means that you can join future cohorts, too.

And of course: There’s a 100% money-back guarantee.

I’m sure that WPE is the best way to improve your Python, and thus improve your career as a developer or data scientist. Questions? Just e-mail me at reuven@lerner.co.il, and I’ll respond ASAP.

 

The post Weekly Python Exercise: Registration closes in two days appeared first on Lerner Consulting Blog.


Podcast.__init__: The Business Of Technical Authoring With William Vincent


Summary

There are many aspects of learning how to program and at least as many ways to go about it. This is multiplicative with the different problem domains and subject areas where software development is applied. In this episode William Vincent discusses his experiences learning web development mid-career and then writing a series of books to make the learning curve for Django newcomers shallower. This includes his thoughts on the business aspects of technical writing and teaching, the challenges of keeping content up to date with the current state of software, and the ever-present lack of sufficient information for new programmers.

Preface

  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
  • When you’re ready to launch your next app you’ll need somewhere to deploy it, so check out Linode. With private networking, shared block storage, node balancers, and a 40Gbit network, all controlled by a brand new API you’ve got everything you need to scale up. Go to podcastinit.com/linode to get a $20 credit and launch a new server in under a minute.
  • Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. You can reach me on Twitter at @Podcast__init__ or email hosts@podcastinit.com
  • To help other people find the show please leave a review on iTunes, or Google Play Music, tell your friends and co-workers, and share it on social media.
  • Join the community in the new Zulip chat workspace at podcastinit.com/chat
  • Your host as usual is Tobias Macey and today I’m interviewing William Vincent about his experience learning to code mid-career and then writing a series of books to bring you along on his journey from beginner to advanced Django developer

Interview

  • Introductions
  • How did you get introduced to Python?
  • How has your experience as someone who began working as a developer mid-career influenced your approach to software?
  • How do you compare Python options for web development (Django/Flask) to others such as Ruby on Rails or Node/Express in the JavaScript world?
  • What was your motivation for writing a beginner guide to Django?
    • What was the most difficult aspect of determining the appropriate level of depth for the content?
    • At what point did you decide to publish the tutorial you were compiling as a book?
  • In the posts that you wrote about your experience authoring the books you give a detailed description of the economics of being an author. Can you discuss your thoughts on that?
    • Focusing on a library or framework, such as Django, increases the maintenance burden of a book, versus one that is written about fundamental principles of computing. What are your thoughts on the tradeoffs involved in selecting a topic for a technical book?
  • Challenges of creating useful intermediate content (lots of beginner tutorials and deep dives, not much in the middle)
  • After your initial foray into technical authoring you decided to follow it with two more books. What other topics are you covering with those?
    • Once you are finished with the third do you plan to continue writing, or will you shift your focus to something else?
  • Translating content to reach a larger audience
  • What advice would you give to someone who is considering writing a book of their own?
    • What alternative avenues do you think would be more valuable for themselves and their audience?
    • Alternative avenues for providing useful training to developers

Keep In Touch

Picks

Links

The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

Mike Driscoll: PyDev of the Week: Younggun Kim


This week we welcome Younggun Kim (@scari_net) as our PyDev of the Week! Younggun has been on the board of directors for the Python Software Foundation and is the founder of PyCon Korea. He has translated several programming books into Korean. You can get the full list on his website. You can also check his Github to see some of the projects he has worked on. Now let’s take a few moments to get to know him better!

Can you tell us a little about yourself (hobbies, education, etc):

I’m a Pythonista based in Seoul, Korea and leading engineering department in a video streaming company. I’m also actively involved in our community, especially in the East Asia region. I have served as a board director of the PSF for the 2016/17 term with nomination by Carol Willing. I started PyCon Korea for the first time in 2014 with several local community members here. I travel for 5 or 6 PyCons in a year with a nice PSF conference kit. I’m serving on the PSF Grants working group now.

Why did you start using Python?

I started to learn how to program when I was 8 years old. I had to learn the alphabet first, prior to learning programming, because I was a non-English-speaking kid and had never seen the alphabet in my 8 years of life. Actually, I learned the alphabet from the keyboard, and learned several English words like PRINT, GOTO, and RUN from BASIC before I officially learned English in school.

Since then, I was obsessed with learning programming languages. It was no surprise that I was interested in Python when I first heard about the new language on an IRC channel in the early 2000s. Hyeshik, the op of the channel, was a Python enthusiast and became a CPython committer after several years of contribution. He introduced a lot of interesting things about Python, and witnessing the history of the fast-paced language was a great pleasure.

What other programming languages do you know and which is your favorite?

I know lots of programming languages, from older ones to esoteric ones. C was my favorite since it’s the language I’ve used the longest, but I would say Python is my favorite now. I often compare C to driving a manual transmission car: it’s fun in a way, but why bother with a manual car in traffic jams every day?

What projects are you working on now?

I don’t code so much nowadays, especially at work. I did contribute to pandas with several commits in the past but recently, I prefer organizing development sprints and enjoy helping others to contribute to open source. When I do code, it’s only when I need to prototype quickly for decision making or automate tedious jobs. I could not say much, but I’m writing a prototype of a system that can detect specific content in a real-time video stream with Python.

Which Python libraries are your favorite (core or 3rd party)?

misspellings. Not a joke. 😉 misspellings is a library that checks for misspelled words in source code. Rather than using it myself, I introduce it when giving a talk on open source contributions. What I emphasize in my talk is that not only heroic commits solving difficult problems, but also reporting a bug, fixing errata, writing documentation, and donating are valuable contributions. I introduce misspellings so that people can make their first contribution easily. I believe that once they start, they will keep trying, and no one knows how wonderful it could become in the future.
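
As an illustration, a first session with it might look something like this (a sketch assuming the package installs a misspellings command that takes source files as arguments):

pip install misspellings
misspellings mypackage/*.py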

Is there anything else you’d like to say?

Within our community, we feel a fellowship with others as a result of sharing common interests and showing respect and courtesy to each other without discrimination of any kind. This makes our community a better place.

As you may be aware, peace on the Korean Peninsula is just around the corner. It has been the ardent dream of the Korean people. However, due to 70 years of division of the Korean peninsula, vast differences between the South and North in lifestyle, wealth, political beliefs, and other matters may make us struggle every day.

I hope Pythonistas in both South and North get together someday and transcend those differences with what we have learned from our community, so that we can make a small contribution to peace.

If any reader of this blog knows a Pythonista in North Korea, would you tell them that I will go to see them when peace comes, and to please wait for me until then?

Thanks for doing the interview!

Codementor: Variable assignment in python: behind the scenes

Variable assignment statements in Python.

PyBites: Code Challenge 51 - Analyse NBA Data with SQL/sqlite3


It's not that I'm so smart, it's just that I stay with problems longer. - A. Einstein

Hey Pythonistas,

Blog Code Challenges is back! And with a vengeance ;)

Starting today we will publish a new code challenge every week on Monday. On Friday (or over the weekend at the latest) we will post a review.

Welcome to Pybites Code Challenge 51! In this challenge we get you analysing NBA player data from a CSV file.

The Challenge

If you are reading this on our blog head over to https://codechalleng.es/challenges/51.

If you need help getting set up with GitHub, see our new instruction video.

Now for the challenge:

  • Make a virtual env and install requests. No need to install sqlite3 as it's part of the stdlib.

  • Copy the nba.py file over to your subdirectory.

  • As you can see in the template nba.py file, we've given you a head start by importing the data and parsing the CSV into a list of named tuples.

  • Start coding under the "#CODE HERE" comment and complete the 7 functions we've laid out for you.

  • Note that there are some assert statements under main to help you validate your code.

  • This challenge is mainly focused on sqlite3, but if you want to use an ORM like sqlalchemy or Pandas that's fine too.
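
If sqlite3 is new to you, the basic pattern the challenge builds on looks like this (a minimal sketch with a made-up table and columns; the real schema comes from the parsed data in the nba.py template):

import sqlite3

# In-memory database: nothing is written to disk.
conn = sqlite3.connect(':memory:')
cur = conn.cursor()
cur.execute('CREATE TABLE players (name TEXT, team TEXT, points REAL)')
cur.executemany('INSERT INTO players VALUES (?, ?, ?)',
                [('Player A', 'BOS', 21.5), ('Player B', 'LAL', 18.2)])
conn.commit()

# Parameterized queries keep the SQL safe and reusable.
cur.execute('SELECT name FROM players ORDER BY points DESC LIMIT 1')
print(cur.fetchone())  # -> ('Player A',)
conn.close()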

PyBites Community

A few more things before we take off:

  • Do you want to discuss this challenge and share your Pythonic journey with other passionate Pythonistas? Confirm your email on our platform then request access to our Slack via settings.

  • PyBites is here to challenge you because becoming a better Pythonista requires practice, a lot of it. For any feedback, issues or ideas use GH Issues, tweet us or ping us on our Slack.


>>> from pybites import Bob, Julian
Keep Calm and Code in Python!

Gaël Varoquaux: A foundation for scikit-learn at Inria


We have just announced that a foundation will be supporting scikit-learn at Inria [1]: scikit-learn.fondation-inria.fr

Growth and sustainability

This is an exciting turn for us, because it enables us to receive private funding. As a result, we will be able to provide secure employment for some existing core contributors, and to hire more people onto the team. The goal is to help sustain quality (more frequent releases?) and to tackle some ambitious features.

A foundation? What and why?

Open source lives and thrives by its base, the community of developers. And scikit-learn is a fantastic example of these dynamics. Because of its grass-roots origins, it has focused on features that matter for the small and the many, such as ease of use and statistical models that work well in data-poor situations. Over the years, decisions have been based on their technical merit, rather than on the appeal of displaying a list of trendy features. A consequence of the breadth of contributors with different backgrounds is that the library tends to be well-suited for many applications, including some models that are less mainstream.

People with dedicated time to support the community

That said, over time there is an increasing need for a core team of maintainers. As the library gets bigger, it is more and more difficult to have a full view of what is happening. Integration of new features, quality assurance, and releases are best done by developers who can dedicate a large amount of time to the library. Also, ambitious changes to the library, such as improving the parallel computing engine, need sustained effort. For many years, we have had people with dedicated time to support the community. In France, we were jumping through hoops to find public money to fund them. As someone who has made this effort, I can tell you that it is a complicated one [2].

The ability to receive money from sponsors will enable us to scale up our operations. I was initially worried that we would have difficulty finding partners willing to give us money without asking for control of the project. However, I was proven wrong, and we have found a small set of great partners.

What will people work on? How will decisions be made?

It can be a difficult exercise to balance how money is used in a community-driven project. The project should not lose its drive, in which the community of developers is central. The interests of the sponsors should not take precedence over the interests of the user base.

We will make sure that the money that the foundation receives is invested in the interest of the community. We have a technical committee that supervises the activity of the foundation. Its decisions will be informed by the community [3]. For this, we have an advisory board composed of core contributors of scikit-learn. Besides the advisory board, the technical committee also comprises a delegate from each sponsor. I am excited about the input that our partners will provide on their priorities, as they represent various industries. Voting power will be spread so that sponsors and community have the same voting power.

Why not an existing foundation such as NumFOCUS, or the PSF?

There are several reasons why we chose this particular legal vehicle. Our endeavor is slightly different from the prominent foundations in our ecosystem, NumFOCUS and the PSF (Python Software Foundation).

The first important aspect is that we want to employ full-time developers. Different countries have very different legal frameworks, and it is really hard to transfer money overseas within a non-profit. Physical matters like employing people or owning real estate are even harder. We needed something in France. And there might be a need for something else in another country at some point.

Another reason to be embedded in the Inria foundation is that it is giving us a really good deal. We basically get legal advice, accounting, office space, and IT support, for an 8% overhead. This is an excellent deal and is part of the sponsoring efforts that Inria will keep doing.

Last, we feel that a foundation targeting specifically scikit-learn can raise money from different people than other foundations can. I think that there is value in having multiple foundations seeking money for open-source software. Indeed, a foundation builds a case and an image to convince donors. Different donors require a different case and a different image. For instance, the president of NumFOCUS argues for a name less focused on numerics. Yet, too wide a scope can dilute the image.

We have in mind to make it easy for other foundations to support scikit-learn. We have major contributors at leading institutions, such as Andreas Mueller at Columbia or Joel Nothman at the University of Sydney. It is important that these institutions can easily gather donations too, in the legal framework suited to their country. Hence the name reflects that the foundation is embedded at Inria, leaving room for other initiatives.

What’s the scope?

The scope of our work is everything scikit-learn related. It is not the whole pydata or scipy ecosystem: it is focused on scikit-learn. But we will not hesitate to contribute fixes and enhancements to neighboring projects, as in the past, even all the way up to core Python [4].


I’m am very excited. A strong team of full-time contributors will allow us to do ambitious things with scikit-learn.

Join us

We will be recruiting! See our positions. Come work with us in Paris.

I want to end by thanking the amazing men and women who have been contributing to scikit-learn, and who are with us in this fantastic adventure! The energy in this project is incredible. We are launching this effort thanks to you, and to empower you even more.


[1]I am quite proud that over the years, my group has employed Olivier Grisel, Joris van den Bossche (working on pandas in addition to scikit-learn), Guillaume Lemaître (working on imbalanced-learn in addition to scikit-learn), Jérémie du Boisberranger, Tom Moreau, Loic Estève, Fabian Pedregosa, to name only a few. All these people, and the many other students that we have paid part-time to work on software, have had a structuring impact on our ecosystem, going beyond the bounds of scikit-learn and touching many aspects of computing in Python. However, because of the constraints of research funding in France, public money forced me to hire them on short-term contracts.
[2]Technically, it is a tax-deductible scikit-learn consortium inside the Inria foundation, which is a non-profit entity related to Inria.
[3]Details on the governance of the foundation can be found at https://scikit-learn.fondation-inria.fr/en/mission-and-governance
[4]For instance, Olivier and Tom have been making parallelism more robust in Python 3.7 (among various issues, https://bugs.python.org/issue33056 and https://bugs.python.org/issue31699). Olivier helped define the new pickling protocol, crucial to efficient persistence. This is hard work. Yet it is important, because it benefits all libraries.

Python Anywhere: Force HTTPS on your website


One smaller feature we added in our last system update was the ability to force HTTPS without needing to change your code. Here's a bit more information about why you might want to do that, and how it works.

Why force HTTPS?

If you're using your default yourusername.pythonanywhere.com website on PythonAnywhere, people can access it using a non-secure connection via http://yourusername.pythonanywhere.com, or using a secure connection via https://yourusername.pythonanywhere.com. Likewise, if you have a paid account and are using a custom domain, they can access it non-securely via http://www.yourdomain.com or -- if you've set up an HTTPS certificate -- they can access it securely via https://www.yourdomain.com.

Having two ways to access the site is fine, but these days there's a general move across the Internet to using HTTPS for everything. Newer versions of Chrome explicitly put the words "Not secure" in the browser's address bar when you access a non-HTTPS site, and Google also apparently ranks HTTPS sites higher in search results.

So often, what you want to do is set things up so that people can only use HTTPS to access your site -- never simple non-secure HTTP. This is called "forcing HTTPS", and works by sending back a redirect when people try to access the non-secure URL. So if someone goes to http://www.yourdomain.com, they're just bounced straight over to https://www.yourdomain.com, and likewise if they go to http://www.yourdomain.com/something they're redirected to https://www.yourdomain.com/something.

How it used to work

Because forcing HTTPS is a common requirement, most of the main Python web frameworks have support for it built in -- Django has a SECURE_SSL_REDIRECT setting, for example, and there's a Flask extension called flask-sslify. In the past, we suggested that people add the appropriate configuration to their code to use those features, and it all worked fine.
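
For reference, the framework-level setup looked roughly like this (a sketch: the Django setting relies on SecurityMiddleware being enabled, and the Flask version assumes the flask-sslify extension is installed):

# Django: settings.py
SECURE_SSL_REDIRECT = True  # redirect any plain-HTTP request to HTTPS

# Flask, with flask-sslify
from flask import Flask
from flask_sslify import SSLify

app = Flask(__name__)
sslify = SSLify(app)  # registers a before-request hook that issues the redirect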

There were just two problems:

  • If you use the static files system on the "Web" tab, it picks up any requests before they get anywhere near your code. So while any requests that went to your view code were redirected to HTTPS, your static assets would still be served over plain HTTP if that was the kind of request made by the browser.
  • Putting this kind of protocol-level stuff in the Python code just feels like it's happening at the wrong layer of the application. In particular, if you're testing locally then perhaps you don't want HTTPS to be forced there -- and while the framework plugins generally have support for only forcing it in non-debug environments, it can be a little fiddly.

How it works now

We made a bunch of changes to the way our loadbalancers work as part of our recent system update, primarily in order to improve the way we handle HTTPS certificates. Because we were working in that part of the code, it was easy for us to add the capability to force HTTPS to our own infrastructure. So now, in order to force HTTPS for your website, just go to the "Web" page inside PythonAnywhere, select the site on the left, scroll down to the "Security" section, and toggle it on:

You'll need to reload the site using the button at the top of the page to activate it.

Caveats

There are a couple of things to keep in mind when using this feature.

  • Most importantly, if you have a custom domain, you'll need to make sure you keep your HTTPS certificate updated and that it does not expire. This is because every person coming to your site will be sent to the secure version -- and if the cert has expired, they'll get a security warning, which won't look good!
  • A redirect message from a server only includes the URL to redirect to -- not, for example, any POSTed data. This means that we can't do a force-HTTPS redirect in response to a POST request, because the data would get lost. (It also wouldn't be super-valuable -- if the user has POSTed data, then it's already gone across the Internet unencrypted, so sending it again in encrypted form wouldn't really help matters.)

Any questions?

Hopefully that was all pretty clear, but if you do have any questions, just drop us a line at support@pythonanywhere.com or leave a comment below.

Real Python: Top 10 Must-Watch PyCon Talks


For the past three years, I’ve had the privilege of attending the Python Conference (PyCon) in the United States. PyCon US is a yearly event where Pythonistas get together to talk and learn about Python. It’s a great place to learn, meet new fellow Python devs, and get some seriously cool swag.

The first time I attended, I quickly realized that it was more a community event than a typical conference. There were people from all over the world, from all walks of life. There were no prejudicial biases—apart from everyone knowing that Python is the best programming language out there!

Learn More: Click here to join 45,000+ Python developers on the Real Python Newsletter and get new Python tutorials and news that will make you a more effective Pythonista.

At PyCon, there are so many things you can do. The United States conference is broken up into 3 major parts:

  1. Tutorials: A collection of classroom-like learning sessions where experts teach in depth on a particular topic

  2. Conference:

    • A selection of talks, ranging from 30 to 45 minutes in length, all throughout the day, submitted by members of the Python community

    • Keynote speakers invited by the conference organizers

    • A collection of 5-minute lightning talks given by any attendee who wants the spotlight (Sidenote: Docker was announced in a PyCon 2013 lightning talk.)

  3. Sprints: A week-long event where members get to work on projects proposed by their peers

If you ever get the chance to attend a PyCon event, either in the United States or closer to where you live, I highly recommend it. Not only will you learn more about the Python language, but you’ll be able to meet with other amazing Python developers. Check out Python.org’s list of conferences to see if there are any near you.

When selecting the videos for this list, I limited myself to talks that were given at PyCon US in 2009 or later. I chose only keynote talks and talks that were 30 to 45 minutes long. I didn’t include any tutorials or lightning talks. I also tried to select videos that would stand the test of time, meaning the topics they cover will hopefully be useful for a long time for both beginners and advanced developers.

Without further ado, here’s my list of the top 10 must-watch PyCon talks.

#10: Refactoring Python: Why and How to Restructure Your Code

Brett Slatkin, PyCon 2016

Brett Slatkin is a Google engineer and the author of Effective Python. He has given many talks related to Python at both PyCon US and PyCon Montreal. In this talk, Brett takes a quick, but deep, dive into what refactoring your code means and involves.

He also explains why refactoring your code is so important that you should spend as much—or even more—time refactoring it than actually developing it. The concepts explored in his talk are great for not only Python developers but for all software engineers.

You can find the slides to his talk here.

#9: Solve Your Problems With Sloppy Python

Larry Hastings, PyCon 2018

Larry Hastings is one of Python’s core developers and has been involved in its development since almost the beginning. He has given quite a few talks on Python at various venues, but this is the one that stands out.

In this talk, he explores when it’s okay to break “Pythonic” convention to quickly solve the problem at hand. I love this talk because it has got some great tips on how and when to break conventions as well as some other Python tricks. It’s a fun talk that is also informative.

#8: Awesome Command Line Tools

Amjith Ramanujam, PyCon 2017

Amjith Ramanujam is a Traffic Engineer at Netflix and creator of PGCLI and MYCLI, amazing interactive command line tools for Postgres and MySQL. Python developers often find themselves creating scripts or programs that require running from the command line. Amjith does a great job of exploring what makes a great command line tool by going over the design decisions made while developing these tools.

#7: Discovering Python

David Beazley, PyCon 2014

David Beazley is another Python core developer with multiple books and talks for learning about Python. I own his Python Cookbook and highly recommend it.

This talk is a little different from the others in that it doesn’t include any Python code. It’s a memoir on how he used Python to solve what would’ve been an impossible task. This talk really showcases the power of Python, a language that is easy to use and can be used to solve real-world problems.

#6: Big-O: How Code Slows as Data Grows

Ned Batchelder, PyCon 2018

Ned Batchelder is the leader of the Python Boston group and has spoken at almost every PyCon since 2009! He’s a great speaker, and I highly recommend going to any of his talks if you get the chance.

I’ve had multiple people attempt to explain what Big-O notation was and why it was important. It wasn’t until I saw Ned’s talk that I began to really grasp it. Ned does a great job of explaining it with simple examples of what Big-O means and why we, as Python developers, need to understand it.

#5: Hidden Treasures in the Standard Library

Doug Hellmann, PyCon 2011

Doug Hellmann is the author of the blog Python Module of the Week, which is dedicated to explaining in detail some of Python's built-in modules. It's a great resource, so I highly recommend that you check it out and subscribe to the feed.

This talk is the oldest in this list and is therefore a little dated in that he still uses Python 2 for the examples. However, he sheds some light on libraries that are hidden treasures and shows unique ways to use them.

You can view this talk over at PyVideo.

#4: Memory Management in Python: The Basics

Nina Zakharenko, PyCon 2016

Nina Zakharenko works for Microsoft as a Python Cloud Developer Advocate, which sounds awesome! In this PyCon 2016 talk, she explores the details of memory management within Python.

It’s common for newer Python developers to not think or care about memory management since it is handled somewhat “automagically.” But it can actually be crucial to know the basics of what is happening behind the scenes so you can learn how to write more efficient code. Nina provides us with a great start to learning these concepts.

#3: All Your Ducks in a Row: Data Structures in the Standard Library and Beyond

Brandon Rhodes, PyCon 2014

Brandon Rhodes is a Python developer at Dropbox and was the chair at PyCon 2016–2017. Whenever you want to know how data structures work, or what they do efficiently, this is the talk to view. I have it bookmarked to refer to whenever I wonder which one I should use.

#2: Beyond PEP 8: Best Practices for Beautiful Intelligible Code

Raymond Hettinger, PyCon 2015

I really could change this to “Raymond Hettinger — Any of his talks” as Raymond has a vast repertoire of great talks. But this one about going beyond PEP 8 is probably the one that is most famous and referenced most often.

Often, as Pythonistas, we get caught up in the strict rules of PEP 8 and deem anything that deviates from it to be “un-Pythonic.” Raymond instead delves into the spirit of PEP 8 and explores when it’s good to be strict about it and when it’s not.

#1: PyCon 2016 Keynote

K. Lars Lohn, PyCon 2016

A hippie biker plays the oboe and teaches life lessons using computer algorithms.

In case that hasn’t catch your attention, he also received a standing ovation at the end of his talk, which I haven’t seen happen since. I had the pleasure of personally attending this talk, which is the epitome of what the Python community is all about: unity, inclusion, and the love of solving complex problems. When I first started putting together this list, this talk immediately came to mind as the one that should be #1.

There it is, my curated list of the must-watch PyCon videos. Comment below with your favorite talks from PyCon US or other PyCons from around the world. Happy Pythoning!


[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]


Stack Abuse: How to Format Dates in Python


Introduction

Python comes with a variety of useful objects that can be used out of the box. Date objects are examples of such objects. Date types are difficult to manipulate from scratch, due to the complexity of dates and times. However, Python date objects make it extremely easy to convert dates into the desired string formats.

Date formatting is one of the most important tasks that you will face as a programmer. Different regions around the world have different ways of representing dates/times, therefore your goal as a programmer is to present the date values in a way that is readable to the users.

For example, you may need to represent a date value numerically like "02-23-2018". On the flip side, you may need to write the same date value in a longer textual format like "Feb 23, 2018". In another scenario, you may want to extract the month in string format from a numerically formatted date value.

In this article, we will study different types of date objects along with their functionalities.

The datetime Module

Python's datetime module, as you probably guessed, contains methods that can be used to work with date and time values. To use this module, we first import it via the import statement as follows:

import datetime  

We can represent time values using the time class. The attributes for the time class include the hour, minute, second and microsecond.

The arguments for the time class are optional, although if you don't specify any arguments you will get back a time of 0, which is unlikely to be what you need most of the time.

For example, to initialize a time object with a value of 1 hour, 10 minutes, 20 seconds and 13 microseconds, we can run the following command:

t = datetime.time(1, 10, 20, 13)  

To see the time, let's use the print function:

print(t)  

Output:

01:10:20.000013  

You may need to see either the hour, minute, second, or microsecond only, here is how you can do so:

print('hour:', t.hour)  

Output:

hour: 1  

The minutes, seconds and microseconds for the above time can be retrieved as follows:

print('Minutes:', t.minute)  
print('Seconds:', t.second)  
print('Microsecond:', t.microsecond)  

Output:

Minutes: 10  
Seconds: 20  
Microsecond: 13  

The values for the calendar date can be represented via the date class. The instances will have attributes for year, month, and day.

Let us call the today method to see today's date:

import datetime

today = datetime.date.today()  
print(today)  

Output:

2018-09-15  

The code will return the date for today, therefore the output you see will depend on the day you run the above script.

Now let's call the ctime method to print the date in another format:

print('ctime:', today.ctime())  

Output:

ctime: Sat Sep 15 00:00:00 2018  

The ctime method uses a longer date-time format than the examples we saw before. This format mirrors C's ctime function, which is primarily used for converting Unix time (the number of seconds since Jan. 1st, 1970) to a string.

And here is how we can display the year, the month, and the day using the date class:

print('Year:', today.year)  
print('Month:', today.month)  
print('Day :', today.day)  

Output

Year: 2018  
Month: 9  
Day : 15  

Converting Dates to Strings with strftime

Now that you know how to create Date and Time objects, let us learn how to format them into more readable strings.

To achieve this, we will be using the strftime method. This method helps us convert date objects into readable strings. It takes two parameters, as shown in the following syntax:

time.strftime(format, t)  

The first parameter is the format string, while the second parameter is the time to be formatted, which is optional.
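
For instance, with the time module form, the second argument is a struct_time (here produced by time.localtime(); if you omit it, the current time is used):

import time

now = time.localtime()  # struct_time for the current local time
print(time.strftime("%b %d %Y %H:%M:%S", now))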

This method can also be used on a datetime object directly, as shown in the following example:

import datetime

x = datetime.datetime(2018, 9, 15)

print(x.strftime("%b %d %Y %H:%M:%S"))  

Output:

Sep 15 2018 00:00:00  

We have used the following character strings to format the date:

  • %b: Returns the first three characters of the month name. In our example, it returned "Sep"
  • %d: Returns day of the month, from 1 to 31. In our example, it returned "15".
  • %Y: Returns the year in four-digit format. In our example, it returned "2018".
  • %H: Returns the hour. In our example, it returned "00".
  • %M: Returns the minute, from 00 to 59. In our example, it returned "00".
  • %S: Returns the second, from 00 to 59. In our example, it returned "00".

We did not pass a time, hence the values for time are all "00". The following example shows how the time can be formatted as well:

import datetime

x = datetime.datetime(2018, 9, 15, 12, 45, 35)

print(x.strftime("%b %d %Y %H:%M:%S"))  

Output:

Sep 15 2018 12:45:35  

The Complete Character Code List

Other than the character strings given above, the strftime method takes several other directives for formatting date values:

  • %a: Returns the first three characters of the weekday, e.g. Wed.
  • %A: Returns the full name of the weekday, e.g. Wednesday.
  • %B: Returns the full name of the month, e.g. September.
  • %w: Returns the weekday as a number, from 0 to 6, with Sunday being 0.
  • %m: Returns the month as a number, from 01 to 12.
  • %p: Returns AM/PM for time.
  • %y: Returns the year in two-digit format, that is, without the century. For example, "18" instead of "2018".
  • %f: Returns microsecond from 000000 to 999999.
  • %Z: Returns the timezone.
  • %z: Returns UTC offset.
  • %j: Returns the number of the day in the year, from 001 to 366.
  • %W: Returns the week number of the year, from 00 to 53, with Monday being counted as the first day of the week.
  • %U: Returns the week number of the year, from 00 to 53, with Sunday counted as the first day of each week.
  • %c: Returns the local date and time version.
  • %x: Returns the local version of date.
  • %X: Returns the local version of time.

Consider the following example:

import datetime

x = datetime.datetime(2018, 9, 15)

print(x.strftime('%b/%d/%Y'))  

Output:

Sep/15/2018  

And here is how you can get the month only:

print(x.strftime('%B'))  

Output:

September  

Let us display the year:

print(x.strftime('%Y'))  

Output:

2018  

In this example we have used the format code %Y. Notice that the Y is in uppercase. Now write it in lowercase:

print(x.strftime('%y'))  

Output:

18  

This time, the century has been omitted. As you can see, with these formatting codes you can represent the date-time in just about any form that you'd like.

Converting Strings to Dates with strptime

The strftime method helped us convert date objects into more readable strings. The strptime method does the opposite, that is, it takes strings and converts them into date objects that Python can understand.

Here is the syntax for the method:

datetime.strptime(string, format)  

The string parameter is the value in string format that we want to convert into date format. The format parameter is the directive specifying the format to be taken by the date after the conversion.

For example, let's say we need to convert the string "9/15/18" into a datetime object.

Let's first import the datetime module. We will use the from keyword in order to be able to reference the specific module functions without the dot format:

from datetime import datetime  

We can then define the date in the form of a string:

date_str = '9/15/18'  # note: a name like "str" would shadow the built-in, so we avoid it

Python will not be able to understand the above string as a datetime until we convert it to an actual datetime object. We can successfully do so by calling the strptime method.

Execute the following command to convert the string:

date_object = datetime.strptime(date_str, '%m/%d/%y')  

Let's now call the print function to display the string in datetime format:

print(date_object)  

Output:

2018-09-15 00:00:00  

As you can see, the conversion was successful!

You can see that the forward slash "/" has been used to separate the various elements of the string. The format string tells the strptime method what format our date is in; in our case, "/" is used as the separator.

But what if the day/month/year was separated by a "-"? Here is how you'd handle that:

from datetime import datetime

date_str = '9-15-18'  
date_object = datetime.strptime(date_str, '%m-%d-%y')

print(date_object)  

Output:

2018-09-15 00:00:00  

And again, thanks to the format specifier the strptime method was able to parse our date and convert it to a date object.

Conclusion

In this article, we studied how to format dates in Python. We saw how the datetime module in Python can be used for the manipulation of date and time values. The module contains a number of classes that can be used for this purpose. For example, the time class is used to represent time values while the date class is used to represent calendar date values.

Reuven Lerner: Last chance: Weekly Python Exercise registration closes soon


You probably want to understand Python better, use it more efficiently, and write code that you (and others) can maintain — for yourself, your current job, and your career.

Weekly Python Exercise lets you make that improvement.  Over the course of a year, you learn to solve more interesting, useful, and complex problems.  You’ll learn how to use decorators, generators, and comprehensions, as well as inner functions, lambdas, and magic methods.

And you’ll learn not just via your own work, but by collaborating with other Python developers around the world in our private forum.  And in monthly office hours with me.

Registration for Weekly Python Exercise is closing soon, and I only open 1-2 cohorts each year.  If you want to level up your Python, then WPE is the best way I know of to do so.

Combine the many benefits of WPE for your career with my money-back guarantee, my “forever free” policy gaining you entry into future cohorts, and the discounts I offer to students, retirees/pensioners, and people living in non-wealthy countries, and I hope you’ll agree that Weekly Python Exercise is a great investment.

Not sure?  Or are you eligible for a discount code?  Or just want to check whether my e-mail is handled by a bot?  In any case, just e-mail me your questions.

Don’t delay. Weekly Python Exercise is starting soon — and along with it, your mastery of Python.

Click here to sign up for Weekly Python Exercise.

The post Last chance: Weekly Python Exercise registration closes soon appeared first on Lerner Consulting Blog.

Programiz: Python Variables, Constants and Literals

In this article, you will learn about Python variables, constants, literals and their use cases.

Codementor: Connect Alibaba Cloud to AWS via VPN Gateway

By Evan Wong, Solutions Architect. Multi-cloud is one of the most sought-after architecture designs, bridging the benefits of multiple providers' technology capabilities and avoiding...

Matthew Rocklin: Dask Development Log


This work is supported by Anaconda Inc

To increase transparency I’m trying to blog more often about the current work going on around Dask and related projects. Nothing here is ready for production. This blogpost is written in haste, so refined polish should not be expected.

Since the last update in the 0.19.0 release blogpost two weeks ago we’ve seen activity in the following areas:

  1. Update Dask examples to use JupyterLab on Binder
  2. Render Dask examples into static HTML pages for easier viewing
  3. Consolidate and unify disparate documentation
  4. Retire the hdfs3 library in favor of the solution in Apache Arrow.
  5. Continue work on hyper-parameter selection for incrementally trained models
  6. Publish two small bugfix releases
  7. Blogpost from the Pangeo community about combining Binder with Dask
  8. Skein/Yarn Update

1: Update Dask Examples to use JupyterLab extension

The new dask-labextension embeds Dask’s dashboard plots into a JupyterLab session so that you can get easy access to information about your computations from Jupyter directly. This was released a few weeks ago as part of the previous release post.

However since then we’ve hooked this up to our live examples system that lets users try out Dask on a small cloud instance using mybinder.org. If you want to try out Dask and JupyterLab together then head here:

Binder

Thanks to Ian Rose for managing this.

2: Render Dask Examples as static documentation

Using the nbsphinx Sphinx extension to automatically run and render Jupyter Notebooks, we've turned our live examples repository into static documentation for easy viewing.

These examples are currently available at https://dask.org/dask-examples/ but will soon be available at examples.dask.org and from the navbar at all dask pages.

Thanks to Tom Augspurger for putting this together.

3: Consolidate documentation under a single org and style

Dask documentation is currently spread out across many small hosted sites, each associated with a particular subpackage like dask-ml, dask-kubernetes, dask-distributed, etc. This eases development (developers are encouraged to modify documentation as they modify code) but results in a fragmented experience, because users don't know how to discover and efficiently explore our full documentation.

To resolve this we’re doing two things:

  1. Moving all sites under the dask.org domain

    Anaconda Inc, the company that employs several of the Dask developers (myself included), recently donated the domain dask.org to NumFOCUS. We’ve been slowly moving over all of our independent sites to use that location for our documentation.

  2. Develop a uniform Sphinx theme dask-sphinx-theme

    This has both uniform styling and also includes a navbar that gets automatically shared between the projects. The navbar makes it easy to discover and explore content and is something that we can keep up-to-date in a single repository.

You can see how this works by going to any of the Dask sites, like docs.dask.org.

Thanks to Tom Augspurger for managing this work and Andy Terrel for patiently handling things on the NumFOCUS side and domain name side.

4: Retire the hdfs3 library

For years the Dask community has maintained the hdfs3 library that allows for native access to the Hadoop file system from Python. This used Pivotal's libhdfs3 library written in C++ and was, for a long while, the only performant and mature way to manipulate HDFS from Python.

Since then though PyArrow has developed efficient bindings to the standard libhdfs library and exposed it through their Pythonic file system interface, which is fortunately Dask-compatible.

We’ve been telling people to use the Arrow solution for a while now and thought we’d now do so officially (see dask/hdfs3 #170). As of the last bugfix release Dask will use Arrow by default and, while the hdfs3 library is still available, Dask maintainers probably won’t spend much time on it in the future.

Thanks to Martin Durant for building and maintaining HDFS3 over all this time.

5: Hyper-parameter selection for incrementally trained models

In Dask-ML we continue to work on hyper-parameter selection for models that implement the partial_fit API. We’ve built algorithms and infrastructure to handle this well, and are currently fine tuning API, parameter names, etc..
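
As a rough sketch of the style of workflow involved (this uses dask-ml's Incremental wrapper around a partial_fit estimator; the hyper-parameter search API itself was still being settled at the time of writing):

import dask.array as da
from dask_ml.wrappers import Incremental
from sklearn.linear_model import SGDClassifier

# Synthetic data, chunked along the sample axis.
X = da.random.random((10000, 20), chunks=(1000, 20))
y = da.random.randint(0, 2, size=(10000,), chunks=(1000,))

# Incremental calls partial_fit on each block of the Dask arrays in turn.
inc = Incremental(SGDClassifier(max_iter=5, tol=1e-3), scoring='accuracy')
inc.fit(X, y, classes=[0, 1])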

If you have any interest in this process, come on over to dask/dask-ml #356.

Thanks to Tom Augspurger and Scott Sievert for this work.

6: Two small bugfix releases

We’ve been trying to increase the frequency of bugfix releases while things are stable. Since our last writing there have been two minor bugfix releases. You can read more about them here:

7: Binder + Dask

The Pangeo community has done work to integrate Binder with Dask and has written about the process here: Pangeo meets Binder

Thanks to Joe Hamman for this work and the blogpost.

8: Skein/Yarn Update

The Dask-Yarn connection to deploy Dask on Hadoop clusters uses the Skein library to easily manage Yarn jobs from Python.

Skein has seen a lot of activity over the last few weeks, including the following:

  1. A Web UI for the project. See jcrist/skein #68
  2. A Tensorflow on Yarn project from Criteo that uses Skein. See github.com/criteo/tf-yarn

This work is mostly managed by Jim Crist and other Skein contributors.

Mike Driscoll: Jupyter Notebook 101: Writing Update


I don’t usually write about my book writing while the book is in progress on my blog, but I know some readers probably wonder why there are times where I am not writing blog posts as regularly as I usually do. The reason is usually because I am deep into writing chapters for a book and if the book’s chapters don’t translate into good blog articles, then the blog itself doesn’t get a lot of new content.

Anyway, as you may know, I am currently working on a book called Jupyter Notebook 101 which I am currently planning to release in November. I have 7 of the planned 11 chapters finished, although I plan to go over the entire book and check it for errors once it’s done. I am hoping to get the other chapters done early so I can write a few bonus chapters too, but we will see how the writing goes. On the plus side, these latter chapters will make good blog fodder, so you can expect to see some interesting articles on the Jupyter Notebook appearing on this blog in the near future.

If you’re interested in checking out the book, you can download a sample from Leanpub.

Continuum Analytics Blog: Anaconda and Kx Systems Partner to Deliver kdb+ Database System and Related Machine Learning Libraries


Anaconda, Inc., the most popular Python data science platform provider with 2.5 million downloads per month, is pleased to announce an exciting new partnership with Kx Systems, a provider of fast, efficient, and flexible tools for processing real-time and historical data. As part of our partnership, Anaconda has added the kdb+ database system, and related …

The post Anaconda and Kx Systems Partner to Deliver kdb+ Database System and Related Machine Learning Libraries appeared first on Anaconda.


Caktus Consulting Group: Better Python Dependency Management with pip-tools


I recently looked into whether I could use pip-tools to improve my workflow around projects' Python dependencies. My conclusion was that pip-tools would help on some projects, but it wouldn't do everything I wanted, and I couldn't use it everywhere. (I tried pip-tools version 2.0.2 in August 2018. If there are newer versions, they might fix some of the things I ran into when trying pip-tools.)

My problems

What were the problems I wanted to find solutions for, that just pip wasn't handling? Software engineer Kenneth Reitz explains them pretty well in his post, but I'll summarize here.

Let me start by briefly describing the environments I'm concerned with. First is my development environment, where I want to manage the dependencies. Second is the test environment, where I want to know exactly what packages and versions we test with, because then we come to the deployed environment, where I want to use exactly the same Python packages and versions as I've used in development and testing, to be sure no problems are introduced by an unexpected package upgrade.

The way we often handle that is to have a requirements file with every package and its version specified. We might start by installing the packages we know that we need, then saving the output of pip freeze to record all the dependencies that also got installed and their versions. Installing into an empty virtual environment using that requirements file gets us the same packages and versions.
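
In concrete terms, that workflow is something like this (the package name and paths are illustrative):

$ <venv>/bin/pip install celery
$ <venv>/bin/pip freeze > requirements.txt

# later, reproducing the environment in a fresh, empty virtualenv:
$ <venv2>/bin/pip install -r requirements.txt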

But there are several problems with that approach.

First, we no longer know which packages in that file we originally wanted, and which were pulled in as dependencies. For example, maybe we needed Celery, but installing it pulled in a half-dozen other packages. Later we might decide we don't need Celery anymore and remove it from the requirements file, but we don't know which other packages we can also safely remove.

Second, it gets very complicated if we want to upgrade some of the packages, for the same reasons.

Third, having to do a complete install of all the packages into an empty virtual environment can be slow, which is especially aggravating when we know little or nothing has changed, but that's the only way to be sure we have exactly what we want.

Requirements

To list my requirements more concisely:

  • Distinguish direct dependencies and versions from incidental
  • Freeze a set of exact packages and versions that we know work
  • Have one command to efficiently update a virtual environment to have exactly the frozen packages at the frozen versions and no other packages
  • Make it reasonably easy to update packages
  • Work with both installing from PyPI, and installing from Git repositories
  • Take advantage of pip's hash checking to give a little more confidence that packages haven't been modified
  • Support multiple sets of dependencies (e.g. dev vs. prod, where prod is not necessarily a subset of dev)
  • Perform reasonably well
  • Be stable

That's a lot of requirements. It turned out that I could meet more of them with pip-tools than just pip, but not all of them, and not for all projects.

Here's what I tried, using pip, virtualenv, and pip-tools.

How to set it up

  1. I put the top-level requirements in requirements.in/*.txt.

    To manage multiple sets of dependencies, we can include "-r file.txt", where "file.txt" is another file in requirements.in, as many times as we want. So we might have a base.txt, a dev.txt that starts with -r base.txt and then adds django-debug-toolbar etc., and a deploy.txt that starts with -r base.txt and then adds gunicorn.

    There's one annoyance that seems minor at this point, but turns out to be a bigger problem: pip-tools only supports URLs in these requirements files if they're marked editable with -e.

# base.txt
Django<2.0
-e git+https://github.com/caktus/django-scribbler@v0.8.0#egg=django-scribbler

# dev.txt
-r base.txt
django-debug-toolbar

# deploy.txt
-r base.txt
gunicorn
  2. Install pip-tools in the relevant virtual environment:
$ <venv>/bin/pip install pip-tools
  3. Compile the requirements as follows:
$ <venv>/bin/pip-compile --output-file requirements/dev.txt requirements.in/dev.txt

This looks only at the requirements file(s) we tell it to look at, and not at what's currently installed in the virtual environment. So one unexpected benefit is that pip-compile is faster and simpler than installing everything and then running pip freeze.

The output is a new requirements file at requirements/dev.txt.

pip-compile nicely puts a comment at the top of the output file to tell developers exactly how the file was generated and how to make a newer version of it.

#
# This file is autogenerated by pip-compile
# To update, run:
#
#    pip-compile --output-file requirements/dev.txt requirements.in/dev.txt
#
-e git+https://github.com/caktus/django-scribbler@v0.8.0#egg=django-scribbler
django-debug-toolbar==1.9.1
django==1.11.15
pytz==2018.5
sqlparse==0.2.4           # via django-debug-toolbar
  4. Be sure requirements, requirements.in, and their contents are in version control.

How to make the current virtual environment have the same packages and versions

To update your virtual environment to match your requirements file, ensure pip-tools is installed in the desired virtual environment, then:

$ <venv>/bin/pip-sync requirements/dev.txt

And that's all. There's no need to create a new empty virtual environment to make sure only the listed requirements end up installed. If everything is already as we want it, no packages need to be installed at all. Otherwise only the necessary changes are made. And if there's anything installed that's no longer mentioned in our requirements, it gets removed.

Except ...

pip-sync doesn't seem to know how to uninstall the packages that we installed using -e <URL>. I get errors like this:

Can't uninstall 'pkgname1'. No files were found to uninstall.
Can't uninstall 'pkgname2'. No files were found to uninstall.

I don't really know, then, whether pip-sync is keeping those packages up to date. Maybe before running pip-sync, I could just

rm -rf $VIRTUAL_ENV/src

to delete any packages that were installed with -e? But that's ugly and would be easy to forget, so I don't want to do that.

How to update versions

  1. Edit requirements.in/dev.txt if needed.
  2. Run pip-compile again, exactly as before:
$ <venv>/bin/pip-compile --output-file requirements/dev.txt requirements.in/dev.txt
  3. Update the requirements files in version control.

Hash checking

I'd like to use hash checking, but I can't yet. pip-compile can generate hashes for packages we will install from PyPI, but not for ones we install with -e <URL>. Also, pip-sync doesn't check hashes. pip install will check hashes, but if there are any hashes, then it will fail unless all packages have hashes. So if we have any -e <URL> packages, we have to turn off hash generation or we won't be able to pip install with the compiled requirements file. We could still use pip-sync with the requirements file, but since pip-sync doesn't check hashes, there's not much point in having them, even if we don't have any -e packages.
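
For a project with no -e requirements, hash generation is one flag on the compile step, and plain pip will then verify the hashes at install time (a sketch using pip-tools' --generate-hashes option and pip's --require-hashes flag):

$ <venv>/bin/pip-compile --generate-hashes --output-file requirements/dev.txt requirements.in/dev.txt
$ <venv>/bin/pip install --require-hashes -r requirements/dev.txt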

What about pipenv?

Pipenv promises to solve many of these same problems. Unfortunately, it imposes other constraints on my workflow that I don't want. It's also changing too fast at the moment to rely on in production.

Pipenv solves several of the requirements I listed above, but fails on these:

  • It only supports two sets of requirements: base, and base plus dev, not arbitrary sets as I'd like.
  • It can be very slow.
  • It's not (yet?) stable: the interface and behavior change constantly, sometimes multiple times in the same day.

It also introduces some new constraints on my workflow. Primarily, it wants to control where the virtual environment is in the filesystem. That both prevents me from putting my virtual environment where I'd like it to be, and prevents me from using different virtual environments with the same working tree.

Shortcomings

pip-tools still has some shortcomings, in addition to the problems with checking hashes I've already mentioned.

Most concerning are the errors from pip-sync when packages have previously been installed using -e <URL>. I feel this is an unresolved issue that needs to be fixed.

Also, I'd prefer not to have to use -e at all when installing from a URL.

This workflow is more complicated than the one we're used to, though I don't think it's any more complicated than the one we'd have with pipenv.

The number and age of open issues in the pip-tools git repository worry me. True, it's orders of magnitude fewer than some projects, but it still suggests to me that pip-tools isn't as well maintained as I might like if I'm going to rely on it in production.

Conclusions

I don't feel that I can trust pip-tools when I need to install packages from Git URLs.

But many projects don't need to install packages from Git URLs, and for those, I think adding pip-tools to my workflow might be a win. I'm going to try it with some real projects and see how that goes for a while.

Thibauld Nion: Long overdue release of Yapsy


TL;DR: Yapsy v1.12 has been released with fixes for Python3.6 and multiprocessing on windows.

So, after 3 years busy with a fair bit of work and joyful family activities, I eventually got some time to actually release Yapsy – the fat-free DIY python plugin management toolkit.

There were a fair number of contributions (compared to the modest size of the project) that I'm sorry not to have released earlier, but which bring some nice polish to yapsy.

The most prominent news, I think, is better compatibility with modern Python (especially 3.6) and the fix for a nasty bug that made it impossible to instantiate plugins in their own (sub)processes (with the Multiprocessing Plugin Manager) on Windows (changelog below).

Changelog:

  • fix yapsy on python3.6
  • Make the test more robust to “unusual” unpacking of the module (see: https://sourceforge.net/p/yapsy/bugs/32/)
  • Protect against providing a single string to setPluginPlaces (see: https://sourceforge.net/p/yapsy/bugs/38/)
  • Enforce the exact directory list provided at construction time (see: https://sourceforge.net/p/yapsy/bugs/36/)
  • Make the multiprocess plugin work on Windows too! (see: https://sourceforge.net/p/yapsy/bugs/33/)
  • add a filter-based getter selecting plugins on plugininfo properties (see: https://sourceforge.net/p/yapsy/feature-requests/16/)
  • Add callback_after argument to the LoadPlugins method in PluginManager (contrib https://sourceforge.net/p/yapsy/feature-requests/9/)
  • Rejecting a candidate should not be a warning (contrib Guillaume Binet: https://github.com/tibonihoo/yapsy/pull/7)
  • Fix the assignment of plugin_info_cls in PluginFileLocator.__init__ (contrib Xuecheng Zhang: https://github.com/tibonihoo/yapsy/pull/8)
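
For anyone who hasn't used yapsy before, here is a minimal usage sketch for context. The plugin directory name is a made-up example, and your plugins still need the usual .yapsy-plugin info files alongside their code:

from yapsy.PluginManager import PluginManager

# Build a manager and tell it where to look for plugins
manager = PluginManager()
manager.setPluginPlaces(["path/to/plugins"])  # a list of directories, not a bare string

# Scan the directories and load everything found
manager.collectPlugins()

# Activate each discovered plugin
for plugin in manager.getAllPlugins():
    print(plugin.name)
    plugin.plugin_object.activate()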

Now for what’s coming next:

  • in case it’s not done yet, please consider pinning the version of yapsy you depend on, because if I get time to do anything useful for it, the result may very well be incompatible with Python2 and/or include a deep refactoring
  • if you have any interest or motivation for the future of yapsy, feel free to contact me !

Davy Wybiral: Internet of Things Development Board: FireBeetle ESP32

Real Python: Absolute vs Relative Imports in Python


If you’ve worked on a Python project that has more than one file, chances are you’ve had to use an import statement before.

Even for Pythonistas with a couple of projects under their belt, imports can be confusing! You’re probably reading this because you’d like to gain a deeper understanding of imports in Python, particularly absolute and relative imports.

In this tutorial, you’ll learn the differences between the two, as well as their pros and cons. Let’s dive right in!

Free Bonus: 5 Thoughts On Python Mastery, a free course for Python developers that shows you the roadmap and the mindset you'll need to take your Python skills to the next level.

A Quick Recap on Imports

You need to have a good understanding of Python modules and packages to know how imports work. A Python module is a file that has a .py extension, and a Python package is any folder that has modules inside it (or, in Python 2, a folder that contains an __init__.py file).

What happens when you have code in one module that needs to access code in another module or package? You import it!

How Imports Work

But how exactly do imports work? Let’s say you import a module abc like so:

import abc

The first thing Python will do is look up the name abc in sys.modules. This is a cache of all modules that have been previously imported.

If the name isn’t found in the module cache, Python will proceed to search through a list of built-in modules. These are modules that come pre-installed with Python and can be found in the Python Standard Library. If the name still isn’t found in the built-in modules, Python then searches for it in a list of directories defined by sys.path. This list usually includes the current directory, which is searched first.

When Python finds the module, it binds it to a name in the local scope. This means that abc is now defined and can be used in the current file without throwing a NameError.

If the name is never found, you’ll get a ModuleNotFoundError. You can find out more about imports in the official Python documentation.
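
You can poke at this machinery yourself. Here is a small illustrative snippet (the choice of math is just an example):

import sys

print("math" in sys.modules)  # usually False in a fresh interpreter

import math

print("math" in sys.modules)  # True: the module is now cached

print(sys.path[:3])  # the first few directories Python searches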

Note: Security Concerns

Be aware that Python’s import system presents some significant security risks. This is largely due to its flexibility. For example, the module cache is writable, and it is possible to override core Python functionality using the import system. Importing from third-party packages can also expose your application to security threats.

It's worth reading up on these security concerns, and how to mitigate them, before importing code you don't fully trust.
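
As a tiny illustration of how writable the module cache is, consider the snippet below. It deliberately sabotages a later import, so treat it purely as a demonstration:

import sys
import math

sys.modules["math"] = None  # the cache is an ordinary writable mapping

import math  # raises ImportError: the cached None halts the import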

Syntax of Import Statements

Now that you know how import statements work, let’s explore their syntax. You can import both packages and modules. (Note that importing a package essentially imports the package’s __init__.py file as a module.) You can also import specific objects from a package or module.

There are generally two types of import syntax. When you use the first one, you import the resource directly, like this:

import abc

abc can be a package or a module.

When you use the second syntax, you import the resource from another package or module. Here’s an example:

from abc import xyz

xyz can be a module, subpackage, or object, such as a class or function.

You can also choose to rename an imported resource, like so:

import abc as other_name

This renames the imported resource abc to other_name within the script. It must now be referenced as other_name, or it will not be recognized.
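
A concrete, purely illustrative example of why renaming can be handy:

import datetime as dt

today = dt.date.today()  # the module is only reachable through the alias dt
print(today)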

Styling of Import Statements

PEP 8, the official style guide for Python, has a few pointers when it comes to writing import statements. Here’s a summary:

  1. Imports should always be written at the top of the file, after any module comments and docstrings.

  2. Imports should be divided according to what is being imported. There are generally three groups:

    • standard library imports (Python’s built-in modules)
    • related third party imports (modules that are installed and do not belong to the current application)
    • local application imports (modules that belong to the current application)
  3. Each group of imports should be separated by a blank space.

It’s also a good idea to order your imports alphabetically within each import group. This makes finding particular imports much easier, especially when there are many imports in a file.

Here’s an example of how to style import statements:

"""Illustration of good import statement styling.Note that the imports come after the docstring."""# Standard library importsimportdatetimeimportos# Third party importsfromflaskimportFlaskfromflask_restfulimportApifromflask_sqlalchemyimportSQLAlchemy# Local application importsfromlocal_moduleimportlocal_classfromlocal_packageimportlocal_function

The import statements above are divided into three distinct groups, separated by a blank space. They are also ordered alphabetically within each group.

Absolute Imports

You’ve gotten up to speed on how to write import statements and how to style them like a pro. Now it’s time to learn a little more about absolute imports.

An absolute import specifies the resource to be imported using its full path from the project’s root folder.

Syntax and Practical Examples

Let’s say you have the following directory structure:

└── project
    ├── package1
    │   ├── module1.py
    │   └── module2.py
    └── package2
        ├── __init__.py
        ├── module3.py
        ├── module4.py
        └── subpackage1
            └── module5.py

There’s a directory, project, which contains two sub-directories, package1 and package2. The package1 directory has two files, module1.py and module2.py.

The package2 directory has three files: two modules, module3.py and module4.py, and an initialization file, __init__.py. It also contains a directory, subpackage1, which in turn contains a file, module5.py.

Let’s assume the following:

  1. package1/module2.py contains a function, function1.
  2. package2/__init__.py contains a class, class1.
  3. package2/subpackage1/module5.py contains a function, function2.

The following are practical examples of absolute imports:

from package1 import module1
from package1.module2 import function1
from package2 import class1
from package2.subpackage1.module5 import function2

Note that you must give a detailed path for each package or file, from the top-level package folder. This is somewhat similar to its file path, but we use a dot (.) instead of a slash (/).

Pros and Cons of Absolute Imports

Absolute imports are preferred because they are quite clear and straightforward. It is easy to tell exactly where the imported resource is, just by looking at the statement. Additionally, absolute imports remain valid even if the current location of the import statement changes. In fact, PEP 8 explicitly recommends absolute imports.

Sometimes, however, absolute imports can get quite verbose, depending on the complexity of the directory structure. Imagine having a statement like this:

from package1.subpackage2.subpackage3.subpackage4.module5 import function6

That’s ridiculous, right? Luckily, relative imports are a good alternative in such cases!

Relative Imports

A relative import specifies the resource to be imported relative to the current location—that is, the location where the import statement is. There are two types of relative imports: implicit and explicit. Implicit relative imports have been deprecated in Python 3, so I won’t be covering them here.

Syntax and Practical Examples

The syntax of a relative import depends on the current location as well as the location of the module, package, or object to be imported. Here are a few examples of relative imports:

from .some_module import some_class
from ..some_package import some_function
from . import some_class

You can see that there is at least one dot in each import statement above. Relative imports make use of dot notation to specify location.

A single dot means that the module or package referenced is in the same directory as the current location. Two dots mean that it is in the parent directory of the current location—that is, the directory above. Three dots mean that it is in the grandparent directory, and so on. This will probably be familiar to you if you use a Unix-like operating system!

Let’s assume you have the same directory structure as before:

└── project
    ├── package1
    │   ├── module1.py
    │   └── module2.py
    └── package2
        ├── __init__.py
        ├── module3.py
        ├── module4.py
        └── subpackage1
            └── module5.py

Recall the file contents:

  1. package1/module2.py contains a function, function1.
  2. package2/__init__.py contains a class, class1.
  3. package2/subpackage1/module5.py contains a function, function2.

You can import function1 into the package1/module1.py file this way:

# package1/module1.py
from .module2 import function1

You’d use only one dot here because module2.py is in the same directory as the current module, which is module1.py.

You can import class1 and function2 into the package2/module3.py file this way:

# package2/module3.py
from . import class1
from .subpackage1.module5 import function2

In the first import statement, the single dot means that you are importing class1 from the current package. Remember that importing a package essentially imports the package’s __init__.py file as a module.

In the second import statement, you’d use a single dot again because subpackage1 is in the same directory as the current module, which is module3.py.

Pros and Cons of Relative Imports

One clear advantage of relative imports is that they are quite succinct. Depending on the current location, they can turn the ridiculously long import statement you saw earlier to something as simple as this:

from ..subpackage4.module5 import function6

Unfortunately, relative imports can be messy, particularly for shared projects where directory structure is likely to change. Relative imports are also not as readable as absolute ones, and it’s not easy to tell the location of the imported resources.

Conclusion

Good job for making it to the end of this crash course on absolute and relative imports! Now you’re up to speed on how imports work. You’ve learned the best practices for writing import statements, and you know the difference between absolute and relative imports.

With your new skills, you can confidently import packages and modules from the Python standard library, third party packages, and your own local packages. Remember that you should generally opt for absolute imports over relative ones, unless the path is complex and would make the statement too long.

Thanks for reading!



Stack Abuse: NumPy Tutorial: A Simple Example-Based Guide


Introduction


The NumPy library is a popular Python library used for scientific computing applications, and is an acronym for "Numerical Python". NumPy's operations are divided into three main categories: Fourier Transform and Shape Manipulation, Mathematical and Logical Operations, and Linear Algebra and Random Number Generation. To make it as fast as possible, NumPy is written in C and Python.

In this article, we will provide a brief introduction to the NumPy stack and we will see how the NumPy library can be used to perform a variety of mathematical tasks.

Advantages of NumPy

NumPy has several advantages over using core Python mathematical functions, a few of which are outlined here:

  1. NumPy is extremely fast when compared to core Python thanks to its heavy use of C extensions.
  2. Many advanced Python libraries, such as Scikit-Learn, Scipy, and Keras, make extensive use of the NumPy library. Therefore, if you plan to pursue a career in data science or machine learning, NumPy is a very good tool to master.
  3. NumPy comes with a variety of built-in functionalities, which in core Python would take a fair bit of custom code.

Regarding the last point, take a look at the following script:

x = [2, 3, 4, 5, 6]  
y = [a + 2 for a in x]  

Here, in order to add 2 to each element in the list x, we have to traverse the entire list and add 2 to each element individually. Now let's see how we can perform the same task with the NumPy library:

import numpy as np  
nums = np.array([2, 3, 4, 5, 6])  
nums2 = nums + 2  

You can see how easy it is to add a scalar value to each element in the list via NumPy. It is not only readable, but also faster when compared to the previous code.
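
If you want to check the speed claim yourself, here is a quick, unscientific sketch using the standard library's timeit module; the list size and repeat count are arbitrary:

import timeit

setup = "import numpy as np; x = list(range(100000)); nums = np.array(x)"

print(timeit.timeit("[a + 2 for a in x]", setup=setup, number=100))  # list comprehension
print(timeit.timeit("nums + 2", setup=setup, number=100))  # NumPy, typically much faster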

This is just the tip of the iceberg; in reality, the NumPy library is capable of performing far more complex operations in the blink of an eye. Let's explore some of these operations.

NumPy Operations

Before we can perform any NumPy operations, we need to install the NumPy package. To install the NumPy package, you can use the pip installer. Execute the following command to install:

$ pip install numpy

Otherwise, if you are running Python via the Anaconda distribution, you can execute the following command instead:

$ conda install numpy

Now that NumPy is installed, let's see some of the most common operations of the library.

Creating a NumPy Array

NumPy arrays are the building blocks of most of the NumPy operations. NumPy arrays can have any number of dimensions, but in this tutorial we will focus on the two most common kinds: one-dimensional arrays and two-dimensional arrays.

There are several ways to create a NumPy array. In this section, we will discuss a few of them.

The array Method

To create a one-dimensional NumPy array, we can simply pass a Python list to the array method. Check out the following script for an example:

import numpy as np
x = [2, 3, 4, 5, 6]
nums = np.array(x)
type(nums)

In the script above we first imported the NumPy library as np and created a list x. We then passed this list to the array function of the NumPy library. Finally, we checked the type of the resulting array, which gives the following output:

numpy.ndarray  

If you were to print the nums array on screen, you would see it displayed like this:

array([2, 3, 4, 5, 6])  

To create a two-dimensional array, you can pass a list of lists to the array method as shown below:

nums = np.array([[2,4,6], [8,10,12], [14,16,18]])  

The above script results in a matrix where every inner list in the outer list becomes a row. The number of columns is equal to the number of elements in each inner list. The output matrix will look like this:

array([[ 2,  4,  6],  
       [ 8, 10, 12],
       [14, 16, 18]])
The arange Method

Another commonly used method for creating a NumPy array is the arange method. This method takes the start value, the end value, and the step size (which is optional). Take a look at the following example:

nums = np.arange(2, 7)  

Simple enough, right? The above script will return a NumPy array of size 5 with the elements 2, 3, 4, 5, and 6. Remember that arange includes the start value but excludes the end value. The output of this code looks like this:

array([2, 3, 4, 5, 6])  

Now let's add a step size of 2 to our array and see what happens:

nums = np.arange(2, 7, 2)  

The output now looks like this:

array([2, 4, 6])  

You can see that the array starts at 2, proceeds in steps of 2, and ends at 6; the end value of 7 is excluded.

The zeros Method

Apart from generating custom arrays with your pre-filled data, you can also create NumPy arrays with a simpler set of data. For instance, you can use the zeros method to create an array of all zeros as shown below:

zeros = np.zeros(5)  

The above script will return a one-dimensional array of 5 zeros. Print the zeros array and you should see the following:

array([0., 0., 0., 0., 0.])  

Similarly, to create a two-dimensional array, you can pass both the number of rows and columns to the zeros method, as shown below:

zeros = np.zeros((5, 4))  

The above script will return a two-dimensional array of 5 rows and 4 columns:

array([[0., 0., 0., 0.],  
       [0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.]])
The ones Method

Similarly, you can create one-dimensional and two-dimensional arrays of all ones using the ones method as follows:

ones = np.ones(5)  
array([1., 1., 1., 1., 1.])  

And again, for the two-dimensional array, try out the following code:

ones = np.ones((5, 4))  

Now if you print the ones array on the screen, you should see the following two-dimensional array:

[[1. 1. 1. 1.]
 [1. 1. 1. 1.]
 [1. 1. 1. 1.]
 [1. 1. 1. 1.]
 [1. 1. 1. 1.]]
The linspace Method

Another very useful method for creating NumPy arrays is the linspace method. This method takes three arguments: a start value, an end value, and the number of linearly spaced numbers that you want within the specified range. For instance, if the first value is 1, the last value is 10, and you need 10 equally spaced elements within this range, you can use the linspace method as follows:

lin = np.linspace(1, 10, 10)  

The output will return integers from 1 to 10:

array([1., 2., 3., 4., 5., 6., 7., 8., 9., 10.])  

Now let's try to create an array with 20 linearly-spaced elements between 1 and 10. Execute the following script:

lin = np.linspace(1, 10, 20)  

This will result in the following array:

array([ 1.        ,  1.47368421,  1.94736842,  2.42105263,  2.89473684,  
        3.36842105,  3.84210526,  4.31578947,  4.78947368,  5.26315789,
        5.73684211,  6.21052632,  6.68421053,  7.15789474,  7.63157895,
        8.10526316,  8.57894737,  9.05263158,  9.52631579, 10.        ])

Notice that the output might look like a matrix, but it is actually a one-dimensional array; because of spacing, the elements are simply displayed across multiple lines.
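
You can confirm this with a quick sanity check using standard array attributes:

print(lin.ndim)   # 1, so still one-dimensional
print(lin.shape)  # (20,)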

The eye Method

The eye method can be used to create an identity matrix, which can be very useful for a variety of operations in linear algebra. An identity matrix is a square matrix with ones on the main diagonal and zeros everywhere else. Let's create a 4x4 identity matrix using the eye method:

idn = np.eye(4)  

The resultant matrix looks like this:

array([[1., 0., 0., 0.],  
       [0., 1., 0., 0.],
       [0., 0., 1., 0.],
       [0., 0., 0., 1.]])
The random Method

Oftentimes you will need to create arrays of random numbers. You can use the rand function of NumPy's random module to do so. Here is a simple example of the rand function:

random = np.random.rand(2, 3)  

The above script returns a matrix of 2 rows and 3 columns, containing numbers drawn from a uniform distribution between 0 and 1:

array([[0.26818562, 0.65506793, 0.50035001],  
       [0.527117  , 0.445688  , 0.99661   ]])

Similarly, to create a matrix of random numbers with the Gaussian distribution (or "normal" distribution), you can instead use the randn method as shown below:

random = np.random.randn(2, 3)  

Finally, to create an array of random integers, the randint method exists for such a case. The randint method takes the lower bound (inclusive), the upper bound (exclusive), and the number of integers to return. For instance, if you want to create an array of 5 random integers between 50 and 100, you can use this method as follows:

random = np.random.randint(50, 100, 5)  

In our case, the output looked like this:

array([54, 59, 84, 62, 74])  

It is important to mention that these numbers are generated randomly every time you call the method, so you will see different numbers than in our example.
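
If you need reproducible random numbers, for tests or worked examples, you can seed the generator first. A minimal sketch (the seed value is arbitrary):

np.random.seed(42)  # fix the generator's state
print(np.random.randint(50, 100, 5))  # prints the same five integers on every run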

We have now seen several ways of creating NumPy arrays. Let's explore some of the other array functions.

Reshaping NumPy Array

Using NumPy you can convert a one-dimensional array into a two-dimensional array using the reshape method.

Let's first create an array of 16 elements using the arange function. Execute the following code:

nums = np.arange(1, 17)  

The nums array is a one-dimensional array of 16 elements, ranging from 1 to 16:

array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16])  

Now let's convert it into a two-dimensional array of 4 rows and 4 columns:

nums2 = nums.reshape(4, 4)  

The array now looks like this:

array([[ 1,  2,  3,  4],  
       [ 5,  6,  7,  8],
       [ 9, 10, 11, 12],
       [13, 14, 15, 16]])

It is pertinent to mention that you cannot reshape an array if the number of elements in the one-dimensional array is not equal to the product of rows and columns of the reshaped array. For instance, if you have 45 elements in a 1-d array, you cannot reshape it into a matrix of 5 rows and 10 columns, since a 5x10 matrix has 50 elements and the original one only has 45.
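
Attempting exactly that raises an error. A quick illustrative snippet (the exact error text may vary slightly between NumPy versions):

nums = np.arange(45)
nums.reshape(5, 10)  # ValueError: cannot reshape array of size 45 into shape (5,10)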

Finding Max/Min Values

You can use min/max functions to easily find the value of the smallest and largest number in your array. For our example, let's first create an array of 5 random integers:

random = np.random.randint(1, 100, 5)  
print(random)  

Our array of random integers looks like this:

[51 40 84 38  1]

Remember, these numbers are generated randomly, therefore you will most likely have a different set of numbers. Let's use the min and max functions to find the minimum and maximum values of the array we just created. To do so, execute the following code to find the minimum value:

xmin = random.min()  
print(xmin)  

"1" will be printed in the output.

Similarly, for maximum value, execute the following code:

xmax = random.max()  
print(xmax)  

The above script will return "84" as the output.

You can also find the index of the maximum and minimum values using the argmax() and argmin() functions. Take a look at the following script:

print(random.argmax())  

The above script will print "2" since 84 is the largest number in the array and it is located at index 2 (the third position).

Similarly, argmin() will return "4" because 1 is the smallest number and is located at index 4 (the fifth position).

Array Indexing in NumPy

In order to effectively use the NumPy arrays, it is very important to understand the way the arrays are indexed, which I'll discuss in the next few sections.

Indexing with 1-D Arrays

Let's create a simple array of 15 numbers:

nums = np.arange(1, 16)  

You can retrieve any element by passing the index number. Just like Python's lists, NumPy's arrays are zero-indexed. For instance, to find the element at the second index (3rd position) of the array, you can use the following syntax:

print(nums[2])  

We have the digit 3 at the second index, therefore it will be printed on the screen.

You can also print a range of numbers using indexing. To get a range, you pass the start index and the end index (which is exclusive), separated by a colon, inside the square brackets that follow the array name. For example, to get the elements from index 1 through index 7, you can use the following syntax:

print(nums[1:8])  

The above script will print the integers from 2 to 8:

[2 3 4 5 6 7 8]

Here in the nums array, we have 2 at index 1 and 8 at index 7.

You can also slice an array and assign the elements of the sliced array to a new array:

nums2 = nums[0:8]  
print(nums2)  

In the script above we sliced the nums array by extracting its first 8 elements. The resultant elements are assigned to the nums2 array. We then print the nums2 array to the console. The output is a new array of the first 8 numbers:

[1 2 3 4 5 6 7 8]
Indexing with 2-D Arrays

Indexing a two-dimensional NumPy array is very similar to indexing a matrix. Let's first create a 3x3 two-dimensional NumPy array. To do so, run the following code:

nums2d = np.array(([1,2,3],[4,5,6],[7,8,9]))  

Now let's print it out:

print(nums2d)  
[[1 2 3]
 [4 5 6]
 [7 8 9]]

Like 1-D arrays, NumPy arrays with two dimensions also follow zero-based indexing; that is, in order to access the elements in the first row, you have to specify 0 as the row index. Similarly, to access elements in the first column, you need to specify 0 for the column index as well.

Let's retrieve an element from nums2d array, located in the first row and first column:

print(nums2d[0, 0])  

You will see "1" in the output. Similarly, we can retrieve the element at the third row and third column as follows:

print(nums2d[2, 2])  

You will see "9" in the output.

In addition to extracting a single element, you can extract the whole row by passing only the row index to the square brackets. For instance, the following script returns the first row from the nums2d array:

print(nums2d[0])  

The output is just a one-dimensional array:

[1 2 3]

Similarly, to retrieve the first column only, you can use the following syntax:

print(nums2d[:,0])  

The output is, again, an array, but it contains the first element of each row of the two-dimensional array:

[1 4 7]

Finally, to retrieve the elements from the first two rows and first two columns, the following syntax can be used:

print(nums2d[:2,:2])  

The above script returns the following output:

[[1 2]
 [4 5]]

Arithmetic Operations with NumPy Arrays

For the examples in this section, we will use the nums array that we created in the last section.

Let's first add two arrays together:

nums3 = nums + nums  

You can add two arrays together as long as they have the same dimensions. For instance, the nums array contains 15 elements, so we can add it to itself; the elements at the corresponding indexes are added together. If you now print the nums3 array, the output looks like this:

[ 2  4  6  8 10 12 14 16 18 20 22 24 26 28 30]

As you can see, each position is the sum of the 2 elements at that position in the original arrays.

If you add an array with a scalar value, the value will be added to each element in the array. Let's add 10 to the nums array and print the resultant array on the console. Here is how you'd do it:

nums3 = nums + 10  
print(nums3)  

And the resulting nums3 array becomes:

[11 12 13 14 15 16 17 18 19 20 21 22 23 24 25]

Subtraction, addition, multiplication, and division can be performed in the same way.
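
For example, here is a quick sketch of the other element-wise operators applied to the same nums array:

print(nums - 5)  # element-wise subtraction
print(nums * 2)  # element-wise multiplication
print(nums / 2)  # element-wise division (the result is always floats)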

Apart from simple arithmetic, you can apply more complex functions to NumPy arrays, e.g. log, square root, exponential, etc.

The log Function

The following code simply returns an array with the natural logarithm of each element in the input array:

nums3 = np.log(nums)  
print(nums3)  

The output looks like this:

[0.         0.69314718 1.09861229 1.38629436 1.60943791 1.79175947
 1.94591015 2.07944154 2.19722458 2.30258509 2.39789527 2.48490665
 2.56494936 2.63905733 2.7080502 ]
The exp Function

The following script returns an array with the exponential (e raised to the power of the element) of each element in the input array:

nums3 = np.exp(nums)  
print(nums3)  
[2.71828183e+00 7.38905610e+00 2.00855369e+01 5.45981500e+01
 1.48413159e+02 4.03428793e+02 1.09663316e+03 2.98095799e+03
 8.10308393e+03 2.20264658e+04 5.98741417e+04 1.62754791e+05
 4.42413392e+05 1.20260428e+06 3.26901737e+06]
The sqrt Function

The following script returns an array with the square roots of all the elements in the input array:

nums3 = np.sqrt(nums)  
print(nums3)  
[1.         1.41421356 1.73205081 2.         2.23606798 2.44948974
 2.64575131 2.82842712 3.         3.16227766 3.31662479 3.46410162
 3.60555128 3.74165739 3.87298335]
The sin Function

The following script returns an array with the sine of all the elements in the input array:

nums3 = np.sin(nums)  
print(nums3)  
[ 0.84147098  0.90929743  0.14112001 -0.7568025  -0.95892427 -0.2794155
  0.6569866   0.98935825  0.41211849 -0.54402111 -0.99999021 -0.53657292
  0.42016704  0.99060736  0.65028784]

Linear Algebra Operations with NumPy Arrays

One of the biggest advantages of the NumPy arrays is their ability to perform linear algebra operations, such as the vector dot product and the matrix dot product, much faster than you can with the default Python lists.

Finding the Vector Dot Product

The dot product of two vectors is calculated by multiplying the corresponding elements of the two vectors and then summing the results of the products.

Let's create two vectors and try to find their dot product manually. A vector in NumPy is basically just a 1-dimensional array. Execute the following script to create our vectors:

x = np.array([2,4])  
y = np.array([1,3])  

The dot product of the above two vectors is (2 x 1) + (4 x 3) = 14.

Let's find the dot product without using the NumPy library. Execute the following script to do so:

dot_product = 0  
for a,b in zip(x,y):  
    dot_product += a * b

print(dot_product)  

In the script above, we simply looped through corresponding elements in x and y vectors, multiplied them and added them to the previous sum. If you run the script above, you will see "14" printed to the console.

Now, let's see how we can find the dot product using the NumPy library. Look at the following script:

a = x * y  
print(a.sum())  

We know that if we multiply the two NumPy arrays, the corresponding elements from both arrays are multiplied based on their index. In the script above, we simply multiplied the x and y vectors. We then call the sum method on the resultant array, which sums all the elements of the array. The above script will also return "14" in the output.

The above method is simple, however, the NumPy library makes it even easier to find the dot product via the dot method, as shown here:

print(x.dot(y))  

For very large arrays you should also notice a speed improvement over our Python-only version, thanks to NumPy's use of C code to implement many of its core functions and data structures.

Matrix Multiplication

Like the dot product of two vectors, you can also multiply two matrices. In NumPy, a matrix is nothing more than a two-dimensional array. In order to multiply two matrices, the inner dimensions of the matrices must match, which means that the number of columns of the matrix on the left should be equal to the number of rows of the matrix on the right side of the product. For instance, if a matrix X has dimensions [3,4] and another matrix Y has dimensions of [4,2], then the matrices X and Y can be multiplied together. The resultant matrix will have the dimensions [3,2], which is the size of the outer dimensions.

To multiply two matrices, the dot function can be used as shown below:

X = np.array(([1,2,3], [4,5,6]))

Y = np.array(([1,2], [4,5], [7,8]))

Z = np.dot(X, Y)

print(Z)  

In the script above we created a 2x3 matrix named X and a 3x2 matrix named Y. We then found the dot product of the two matrices and assigned the resultant matrix to the variable Z. Finally, we printed the resultant matrix to the console. In the output you should see a 2x2 matrix as shown below:

[[30 36]
 [66 81]]

You can also multiply the two matrices element-wise. To do so, the dimensions of the two matrices must match, just like when we were adding arrays together. The multiply function is used for element-wise multiplication.

Let's try to multiply the matrices X and Y element-wise:

Z = np.multiply(X, Y)  

The following error will occur when you run the above code:

ValueError: operands could not be broadcast together with shapes (2,3) (3,2)  

The error occurs due to the mismatch between the dimensions of the X and Y matrices. Now, let's try multiplying the X matrix with itself using the multiply function:

Z = np.multiply(X, X)  

Now if you print the Z matrix, you should see the following result:

[[ 1  4  9]
 [16 25 36]]

The X matrix could successfully be multiplied with itself because the dimensions of the multiplied matrices matched.

Finding the Inverse of a Matrix

Another very useful matrix operation is finding the inverse of a matrix. The NumPy library contains the inv function in the linalg module.

For our example, let's find the inverse of a 2x2 matrix. Take a look at the following code:

Y = np.array(([1,2], [3,4]))  
Z = np.linalg.inv(Y)  
print(Z)  

The output of the above code looks like this:

[[-2.   1. ]
 [ 1.5 -0.5]]

Now in order to verify if the inverse has been calculated correctly, we can take the dot product of a matrix with its inverse, which should yield an identity matrix.

W = Y.dot(Z)  
print(W)  
[[1.00000000e+00 1.11022302e-16]
 [0.00000000e+00 1.00000000e+00]]

And the result was as we expected: ones on the diagonal and zeros (or values very close to zero) elsewhere.

Finding the Determinant of a Matrix

The determinant of a matrix can be calculated using the det method, which is shown here:

X = np.array(([1,2,3], [4,5,6], [7,8,9]))

Z = np.linalg.det(X)

print(Z)  

In the script above, we created a 3x3 matrix and found its determinant using the det method. In the output, you should see "6.66133814775094e-16". This is effectively zero: the matrix is singular, and the tiny nonzero value is just floating-point rounding error.

Finding the Trace of a Matrix

The trace of a matrix is the sum of all the elements on the diagonal of a matrix. The NumPy library contains the trace function, which can be used to find the trace of a matrix. Look at the following example:

X = np.array(([1,2,3], [4,5,6], [7,8,9]))

Z = np.trace(X)

print(Z)  

In the output, you should see "15", since the sum of the diagonal elements of the matrix X is 1 + 5 + 9 = 15.

Conclusion

Python's NumPy library is one of the most popular libraries for numerical computing. In this article, we explored the NumPy library in detail with the help of several examples. We also showed how to perform different linear algebra operations via the NumPy library, which are commonly used in many data science applications.

While we covered quite a bit of NumPy's core functionality, there is still a lot to learn. If you want to learn more, I'd suggest you try out a course like Data Science in Python, Pandas, Scikit-learn, Numpy, Matplotlib, which covers NumPy, Pandas, Scikit-learn, and Matplotlib in much more depth than what we were able to cover here.

I would suggest you practice the examples in this article. If you are planning to start a career as a data scientist, the NumPy library is definitely one of the tools you will need to master to be a successful and productive member of the field.
