Use this tutorial to learn how to create your first Jupyter Notebook, important terminology, and how easily notebooks can be shared and published online.
The post Jupyter Notebook for Beginners: A Tutorial appeared first on Dataquest.
Hello everyone!
GSoC 2019 has almost come to an end! It's time to wrap up this mega event, which started back in May 2019. Under the mentorship of Mentor Hayen, my learning experience has been a roller-coaster ride, and it has boosted my growth not only as a developer but also as an individual. Over the last 3 months, the following have been my major contributions to this project:
The following are in progress:
* "max" and "min": PR #442 <https://github.com/Nuitka/Nuitka/pull/442>
* "zip": PR #462 <https://github.com/Nuitka/Nuitka/pull/462>
Other minor doc fixes were added with their respective pull requests.
Over this period I have learnt a lot, from how to add new nodes to Nuitka and optimize them to how to optimize using reformulations. I plan to keep learning and contributing to Nuitka.
Thanks for stopping by!
bksahu
Targets for this week:
Almost all my targets and extended goals were achieved before 19th August. This week I have been working on documentation for scikit models in dffml. We want to make the package's docs as clear as possible before release.
About the final evaluation:
I haven't yet submitted the final evaluation at the time of writing this blog. I am waiting for the documentation to be finished in the next couple of days so that I can add it to the work chart too. I think I have done my work well, and I have bonded with the project so much that I think I will continue contributing to it for a long time.
Challenges:
The challenge right now is finding the best way to document the scikit models, which we have been working out by trial and error in recent discussions. I'll post a couple more final blogs soon about contributing to larger projects as a beginner, and probably one more about how to document your code.
"Doing Math with Python" is part of No Starch Press's Python Humble Bundle. Of course, you get No Starch Press's other excellent Python books as part of the bundle. It's still on for the next 10 days!
Your purchases will help support the No Starch Foundation and Python Software Foundation.
Get the bundle here.
This week I attempted to understand and code the out-of-sample predictions for the MGWR model, which is the third part of my proposal. After a discussion with the mentors I realized that the scope of the work was larger than I had first anticipated. I experimented with options for implementing the algorithm, and with the mentors' help we were able to settle on an approach. This part of the project will be treated as ongoing work, which I will continue beyond the Google Summer of Code period.
In the coming week I will compile all the results and work done so far for the Poisson and Binomial models into one or more GitHub web pages to submit as a work product for the final evaluations.
No major issues were encountered this week. Through experimentation and discussions I learnt the technique currently used for out-of-sample predictions for the GWR model, which was very interesting.
Looking forward to the progress update next week!
This week I compiled all the code and experimentation notebooks into a Github web page to be submitted as a part of the final work product for GSoC. The cover page includes all the links to the different parts of the project and all the experiments that went into devising the algorithm for each part.
I will continue to work on the 'ongoing' sections of the project beyond the GSoC period and hope to keep contributing to the PySAL repository.
I wanted to compile the notebooks and markdown files into a Jupyter book, but Jekyll dependency issues on my Windows machine prevented it. I then resorted to the simple GitHub option of building web pages from notebooks, and that worked out well.
I reposted this because I called comprehensions "expressions" in the older post. Sorry!
We are done,
First, check out the pull request for the work product - https://github.com/scrapinghub/spidermon/pull/201
Project Repo - https://github.com/vipulgupta2048/mygsoc
All tasks have been completed as per project proposal.
The Cerberus validation library has now been integrated with Spidermon and its validation pipelines, so users can easily test their data items against custom schemas they define, with little or no configuration.
It brings me great joy to end on a fulfilling note, having contributed to Spidermon and the Scrapy Project as part of Google Summer of Code 2019. I am happy and content with the work produced.
The PR includes,
For system testing, one could use the pre-configured Quotes spider https://github.com/vipulgupta2048/testing_quotes and install Spidermon from the master branch of my fork.
This project was completed through long nights of reading and writing code, learning new concepts on the fly, and asking hundreds of pop questions on Slack, which were duly answered by my mentors @ejulio and @rennerocha. Without their constant help, motivation, and guidance, completing this uphill task would never have been possible.
Thank you all for reading,
You can check out more blogs here - https://mixstersite.wordpress.com/gsoc/
Back in 2013, I wrote Shonku in golang. It helped me in two important things:
Now, I think it worked for me. I could focus on writing the actual content of the posts rather than anything else. The tool has a few flaws, but none of them interfered with my blogging requirements. It just worked for me. I could have written it in Python (in much less time), but learning a new language is always fun.
As I am trying to write more and more Rust, I decided to write a new tool in Rust and use that for my blog https://kushaldas.in.
This is very early code, and you can easily tell that I don’t yet know how to write idiomatic Rust. However, this works. The last couple of posts were made using this tool, and I also regenerated the whole site in between.
The cargo build --release command takes time; at the same time, the release binary is insanely fast.
PyCharm 2019.2.1 is available now!
You can update PyCharm by choosing Help | Check for Updates (or PyCharm | Check for Updates on macOS) in the IDE. PyCharm will be able to patch itself to the new version; there should no longer be a need to run the full installer.
If you’re on Ubuntu 16.04 or later, or any other Linux distribution that supports snap, you should not need to upgrade manually; you’ll automatically receive the new version.
I’m delighted to announce that Weekly Python Exercise is a gold sponsor of PyCon 2020, to be held in Pittsburgh, Pennsylvania. PyCon is the largest Python conference in the world, and is both fun and interesting for Python developers of all experience levels and backgrounds.
This will be the second year in a row sponsoring PyCon, and my third year attending the conference. Sponsoring means that I’ll not only be there, but that I’ll have a booth — giving away T-shirts and advertising the courses I teach at companies around the world, my online course offerings, and (of course) Weekly Python Exercise.
So if you’re a Python developer, you should attend PyCon. I promise that it’s worth attending.
And if you’re a developer who wants to become more fluent in Python, then check out Weekly Python Exercise. A new beginner-level cohort, focusing on objects, will start on September 17th. And a new advanced-level one, on a grab-bag of topics, will start in late October. Questions or comments? Just e-mail me, at reuven@lerner.co.il.
See you in Pittsburgh!
The post Proud to be sponsoring PyCon 2020 appeared first on Reuven Lerner.
You will recall my previous blog post that tried to build the necessary scaffolding for me to finally write up my 2017 PyCon Ireland keynote on the structure of the Medieval universe. It ran into several problems with matplotlib animations— but, having written that post, I realized that the problem ran deeper.
How could any animation show a Solar System, when a Solar System’s motion never exactly repeats? The orbital periods of the planets aren’t exact multiples of each other, and don’t provide a moment when the planets reach their original positions and the animation can repeat. At whatever moment an animation finished and looped back to the beginning, the planets would visibly and jarringly jump back to their original position.
But then I remembered that modern browsers support animation directly, and thought: could a python script produce an SVG diagram with a separate CSS animation for each planet, that repeated each time that specific planet finished a revolution?
The result would be an animated Solar System that fits into a few thousand bytes, renders with perfect clarity, and runs continuously for as long as the viewer is willing to watch!
But there’s a problem.
The CSS animation mechanism is perfect for the simplest possible planetary orbit: uniform circular motion. Here’s a simple SVG diagram in which a planet and the line connecting it to the origin are grouped within a single <g> element.
%pylab inline
from IPython.display import HTML

𝜏 = 2.0 * pi

circular_svg = '''<svg version="1.1" width=220 height=220>
 <g transform="translate(110, 110)">
  <circle cx=0 cy=0 r=100 stroke=lightgray stroke-width=1 fill=none />
  <g class="anim %s">
   <line x1=0 y1=0 x2=100 y2=0 stroke=lightgray />
   <circle cx=100 cy=0 r=5 fill=#bb0 />
  </g>
  <circle cx=0 cy=0 r=3 fill=#040 />
 </g>
</svg>'''
HTML(circular_svg % 'stationary')
Populating the interactive namespace from numpy and matplotlib
We use translate() to move (0,0) to the middle of the diagram where it can serve as the circle’s center. We paint a big circle for the orbit, small circles to mark the orbit’s center and a planet, and a line to link them.
Note, in passing, the great glory of embedding SVG in HTML: such elegance! While raw SVG files are required to be noisy and bureaucratic XML, SVG embedded in HTML is instead parsed using the same ergonomic SGML rules that are used to parse HTML itself. It requires no xmlns: declaration, no double quote noise around simple words and numbers, and, most importantly for our purposes, supports its own <style> element without any CDATA.
We can animate this SVG using just a few lines of CSS:
circular_style = '''<style>
.anim {
 animation-duration: 10s;
 animation-iteration-count: infinite;
}
.uniform {
 animation-name: uniform;
 animation-timing-function: linear;
}
@keyframes uniform {
 to {transform: rotate(360deg);}
}
</style>'''
HTML(circular_svg % 'uniform' + circular_style)
The result is perfect uniform circular motion that will run forever in the browser without any further work.
So what's the problem?
The problem is that the ancient Greeks determined, alas, that none of the planets proceeds across the sky with uniform circular motion. The planets that come closest — the Moon and Sun — are still not perfectly uniform in their motion, running faster on one side of their orbit and slower on the other. (The Greeks defined a “planet” as any object that moves against the background of stars, so the Sun and Moon qualified as planets.)
To better model the planets’ motion, a Greek scientist named Claudius Ptolemy living in the Greek colonial city of Alexandria in Egypt invented the equant: the idea that the center of uniform circular motion might be a different point than the center of the orbit. An equant offset from the orbit’s center would make the planet faster on one side of its orbit and slower on the other side, exactly as observed.
Which raises a problem for us: can the equant be animated using CSS?
It’s easy enough, at least, to construct the static diagram. We now have two centers. The planet rides along the circle, as before, but the moving line that sweeps at a uniform rate is now centered off to the right. Its motion, happily, can be animated by the CSS we’ve already written.
equant_svg = '''<svg version="1.1" width=300 height=220>
 <g transform="translate(110, 110)">
  <circle cx=0 cy=0 r=2 fill=gray />
  <circle cx=0 cy=0 r=100 stroke=lightgray stroke-width=1 fill=none />
  <g class="anim %s">
   <circle cx=100 cy=0 r=5 fill=#bb0 />
  </g>
  <g transform="translate(%s, 0)">
   <circle cx=0 cy=0 r=2 fill=gray />
   <g class="anim uniform">
    <line x1=0 y1=0 x2=%s y2=0 stroke=lightgray />
   </g>
  </g>
 </g>
</svg>'''
HTML(equant_svg % ('stationary', 70, 170))
The problem is getting the planet to move. Its motion will now have to be more complicated, because uniform circular motion won’t keep it aligned with the equant’s revolution:
HTML(equant_svg%('uniform',70,170))
As you can see, a planet with uniform circular motion runs ahead of the equant for half of its orbit and then falls behind schedule for the remaining half.
Unfortunately CSS makes no provision for this special sort of motion; its animation-timing-function does not support the value equant. But it does offer one alternative: motion can be specified using a Bézier curve.
Can Python’s numeric ecosystem help us generate the right Bézier parameters to animate the equant?
From Wikipedia’s entry on the equant we can learn the formula that relates the angle M of the uniform circular motion to the angle E of the planet along its orbit:
def equant_E(M, e):
    return M - arcsin(e * np.sin(M))

M = linspace(0, 𝜏)
E = equant_E(M, 0.7)  # 70 pixel offset / 100 pixel circle
plot(M, E);
The curve has exactly the shape we expect: the planet moves slowly at first, accelerates to its maximum speed (steepest slope) at ½𝜏 — the point on its orbit farthest from the equant — then slows again until it reaches its starting point at an angle of 𝜏.
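The shape described here can also be checked numerically, without a plot. The following is a minimal sketch, assuming plain NumPy in place of the %pylab namespace and the name TAU as a stand-in for 𝜏; the 1001-point grid is an arbitrary choice made only for this illustration.

```python
import numpy as np

TAU = 2.0 * np.pi  # stand-in for the post's 𝜏


def equant_E(M, e):
    """Angle E of the planet on its orbit, given the uniform angle M."""
    return M - np.arcsin(e * np.sin(M))


M = np.linspace(0, TAU, 1001)   # finer grid than the post uses, for a clean slope
E = equant_E(M, 0.7)

slope = np.gradient(E, M)       # numerical dE/dM: the planet's relative speed
steepest_M = M[np.argmax(slope)]

print("E at start and end:", E[0], E[-1])
print("steepest slope at M =", steepest_M, "while TAU / 2 =", TAU / 2)
print("slope at start:", slope[0], "largest slope:", slope.max())
```

The slope is smallest at the start of the orbit and largest halfway around, which matches the slow-fast-slow description.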
The tool that CSS offers for controlling an animation’s speed is a cubic spline starting at the coordinate (0,0) and ending at (1,1). Substituting these fixed coordinates into the generic formula on Wikipedia’s Bézier curve page, we get:
def bezier(t, x1, y1, x2, y2):
    m = 1 - t
    p = 3 * m * t
    b, c, d = p * m, p * t, t * t * t
    return (b * x1 + c * x2 + d, b * y1 + c * y2 + d)
The curve’s formula is specially designed so that the curve leaves the origin with an angle and velocity determined by the line from (0,0) to (x₁,y₁), then dives in towards its destination at an angle and speed determined by the line from (x₂,y₂) to (1,1). For example, we can ask the curve to launch out at a low angle along the x-axis then move up to the top of the diagram to finish with a similar horizontal approach to its endpoint — producing a spline that reminds us of the equant’s motion:
t = linspace(0, 1)
xb, yb = bezier(t, 0.8, 0, 0.2, 1)
plot(xb, yb, '.-')
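The endpoint and direction claims about the curve can be verified directly from the formula. This is a small sketch, assuming plain NumPy instead of %pylab; the step size h is an arbitrary small number chosen for illustration.

```python
import numpy as np


def bezier(t, x1, y1, x2, y2):
    """Cubic Bézier from (0,0) to (1,1) with control points (x1,y1), (x2,y2)."""
    m = 1 - t
    p = 3 * m * t
    b, c, d = p * m, p * t, t * t * t
    return b * x1 + c * x2 + d, b * y1 + c * y2 + d


ctrl = (0.8, 0.0, 0.2, 1.0)   # the control points from the example

start = bezier(0.0, *ctrl)    # curve position at t = 0
end = bezier(1.0, *ctrl)      # curve position at t = 1

h = 1e-6                      # tiny step away from each endpoint
dx0, dy0 = bezier(h, *ctrl)
launch_angle = np.arctan2(dy0, dx0)             # direction leaving (0,0)

xe, ye = bezier(1.0 - h, *ctrl)
arrival_angle = np.arctan2(1.0 - ye, 1.0 - xe)  # direction arriving at (1,1)

print(start, end, launch_angle, arrival_angle)
```

Both angles come out very close to zero, that is, horizontal, because (x₁,y₁) lies on the x-axis and (x₂,y₂) lies level with the endpoint (1,1).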
Can Python help us refine the positions of the points (x₁,y₁) and (x₂,y₂) so the spline approximates the shape of the equant?
SciPy provides a curve_fit() function that searches for the parameters that will make one curve as close as possible to the shape of a target curve, but it carries a requirement: the target curve and the test curve both need to provide y values that correspond to the same set of x’s or, in our case, to the same input M array.
The Bézier curve we’ve produced, alas, does not satisfy this criterion. A Bézier curve is rather independently minded about both its x’s and its y’s. Driven by its input parameter t, the Bézier’s output x and y coordinates make their own decisions about whether to space themselves close together or far apart as the curve swings from its origin to its destination. You can see that the example curve above starts at a high velocity with points spaced far apart, slows down in the middle as the curve moves steeply upward and points come close together, then finishes by accelerating again to its destination at the upper right. We need x’s spaced evenly instead.
Fortunately, NumPy has a solution! It offers an interpolation routine that takes a curve as input and uses it to interpolate y positions for a new input array x. What happens if we ask for the Bézier curve to be mapped atop our input array M?
plot(M,interp(M,xb,yb),'.-')
Whoops.
The result looks rather comic: the entire Bézier curve has been squashed into only about one-sixth of the range of M. We forgot that our input arrays have two completely different domains: uniform circular motion M runs from 0 to 𝜏, while a CSS Bézier curve runs from 0 to 1.
The Bézier curve needs to be scaled up to cover the entire domain of M:
plot(M,interp(M,xb*𝜏,yb*𝜏),'.-')
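The squashing has a simple cause that shows up in the numbers: for inputs beyond the last sample point, np.interp returns the final y value. Here is a minimal sketch of both the mistake and the fix, assuming plain NumPy and TAU standing in for 𝜏.

```python
import numpy as np

TAU = 2.0 * np.pi  # stand-in for the post's 𝜏


def bezier(t, x1, y1, x2, y2):
    m = 1 - t
    p = 3 * m * t
    b, c, d = p * m, p * t, t * t * t
    return b * x1 + c * x2 + d, b * y1 + c * y2 + d


t = np.linspace(0, 1)
xb, yb = bezier(t, 0.8, 0, 0.2, 1)
M = np.linspace(0, TAU)

squashed = np.interp(M, xb, yb)            # Bézier domain [0,1] vs. M in [0,TAU]
scaled = np.interp(M, xb * TAU, yb * TAU)  # both axes stretched to [0,TAU]

beyond = M > 1.0                           # samples outside the Bézier's domain
print("fraction of M beyond 1:", beyond.mean())
print("squashed values there all equal 1:", np.all(squashed[beyond] == 1.0))
print("scaled curve ends at:", scaled[-1])
```

Most of the M samples lie past 1, so without scaling they all receive the curve's final value, which is why the unscaled plot goes flat after the first sixth of its width.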
To let curve_fit() do its job successfully, let's create a wrapper that scales points between M₀ and M₁ into the range [0,1] for input to the Bézier function, and whose output range [0,1] is then expanded to produce output between E₀ and E₁.
def scaled_bezier(M0, M1, E0, E1):
    def bez(M, x1, y1, x2, y2):
        t = linspace(0, 1)
        xb, yb = bezier(t, x1, y1, x2, y2)
        x = (M - M0) / (M1 - M0)
        y = interp(x, xb, yb)
        return E0 + (E1 - E0) * y
    return bez
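One way to sanity-check a wrapper like this is to feed it control points that lie on the diagonal. The Bézier curve is then the straight line y = x, so the wrapper should reduce to a plain linear rescaling from [M₀, M₁] to [E₀, E₁]. The sketch below assumes plain NumPy, TAU for 𝜏, and hypothetical segment endpoints chosen only for illustration.

```python
import numpy as np

TAU = 2.0 * np.pi  # stand-in for the post's 𝜏


def bezier(t, x1, y1, x2, y2):
    m = 1 - t
    p = 3 * m * t
    b, c, d = p * m, p * t, t * t * t
    return b * x1 + c * x2 + d, b * y1 + c * y2 + d


def scaled_bezier(M0, M1, E0, E1):
    def bez(M, x1, y1, x2, y2):
        t = np.linspace(0, 1)
        xb, yb = bezier(t, x1, y1, x2, y2)
        x = (M - M0) / (M1 - M0)       # map M into the Bézier domain [0,1]
        y = np.interp(x, xb, yb)       # look up the curve at evenly spaced x
        return E0 + (E1 - E0) * y      # map the result into [E0, E1]
    return bez


# Hypothetical segment: the middle third of the circle, arbitrary E range.
M0, M1, E0, E1 = TAU / 3, 2 * TAU / 3, 1.0, 5.0
bez = scaled_bezier(M0, M1, E0, E1)

M = np.linspace(M0, M1, 7)
out = bez(M, 1 / 3, 1 / 3, 2 / 3, 2 / 3)          # control points on the diagonal
expected = E0 + (E1 - E0) * (M - M0) / (M1 - M0)  # plain linear rescaling

print(np.max(np.abs(out - expected)))
```

The printed difference should be at the level of floating-point rounding, which confirms that both the input and output scalings are wired up as intended.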
We can now ask SciPy to find a Bézier curve that approximates the equant:
from scipy.optimize import curve_fit

bez = scaled_bezier(0, 𝜏, 0, 𝜏)
guess = [0.5, 0.0, 0.5, 1.0]
args, v = curve_fit(bez, M, E, guess, bounds=[0, 1])
args
array([0.47532029, 0.06537691, 0.52467971, 0.93462309])
Success! The Bézier curve, shown here by orange dots, looks like a fairly close approximation for the equant:
plot(M, E)
plot(M, bez(M, *args), '.')
How closely will the resulting animation represent the true rotation of the equant?
max(abs(E-bez(M,*args)))*360/𝜏
1.980801996224498
This indicates that at the worst spots, this curve will lead or lag the true rotation by almost 2° of arc. Will that be noticeable? Let’s take a look! Here’s CSS that will use a custom animation function to control rotation speed:
bezier1_css = '''<style>
.bezier1 {
 animation-name: bezier1;
}
@keyframes bezier1 {
 from {animation-timing-function: %s}
 to {transform: rotate(360deg)}
}
</style>'''
And a routine to properly format the function:
def cb(args):
    argstr = ', '.join('%.3f' % n for n in args)
    return 'cubic-bezier(%s)' % argstr

cb(args)
'cubic-bezier(0.475, 0.065, 0.525, 0.935)'
Does the result produce an equant and a planet that appear to rotate together?
css = bezier1_css % cb(args)
HTML(equant_svg % ('bezier1', 70, 170) + css)
Drat.
The result is uninspiring. The planet clearly lags behind, then moves ahead of, the target set by the sweeping line of the equant. It turns out that 2° is pretty noticeable.
There is an old rule for approximating motion with a Bézier curve: if one curve doesn’t work, then try more!
I tried two curves; the improvement was not enough. So I tried three, and the result is much better. Here’s a quick loop that splits our curve into three at the M values ⅓𝜏 and ⅔𝜏, and finds the Bézier parameters for each third of the curve. The result will be 3 keyframes that run in succession, each controlled by a different Bézier curve.
KEYFRAME = '''\
%.3f%% {
 transform: rotate(%.3frad);
 animation-timing-function: %s;
}
'''

boundaries = [0, 𝜏/3, 2*𝜏/3, 𝜏]
keyframes = []

for M0, M1 in zip(boundaries[:-1], boundaries[1:]):
    M = linspace(M0, M1)
    E = equant_E(M, 0.7)
    bez = scaled_bezier(M0, M1, E[0], E[-1])
    args, v = curve_fit(bez, M, E, bounds=[0, 1])
    percent = M0 / 𝜏 * 100
    keyframe = KEYFRAME % (percent, E[0], cb(args))
    keyframes.append(keyframe)

print(''.join(keyframes))
0.000% {
 transform: rotate(0.000rad);
 animation-timing-function: cubic-bezier(0.480, 0.206, 0.681, 0.333);
}
33.333% {
 transform: rotate(1.443rad);
 animation-timing-function: cubic-bezier(0.298, 0.264, 0.702, 0.736);
}
66.667% {
 transform: rotate(4.840rad);
 animation-timing-function: cubic-bezier(0.319, 0.667, 0.520, 0.794);
}
The CSS otherwise looks the same as before.
bezier2_css = '''<style>
.bezier2 {
 animation-name: bezier2;
}
@keyframes bezier2 {
 %s
 to {transform: rotate(360deg)}
}
</style>'''
But, this time, the visual output is stunning! The planet seems to follow the equant’s motion perfectly, moving slowly when close to the equant point, and very rapidly when on the opposite side of its orbit.
css = bezier2_css % ''.join(keyframes)
html = equant_svg % ('bezier2', 70, 170) + css
HTML(html)
And the best part is that the animation does not involve matplotlib, isn’t delivered as raster graphics, and won’t involve a trade-off between file size and visual quality. Instead, we’ve delivered it to the browser using less than 1k characters!
len(html)
892
And in return we enjoy a crisp and accurate animation.
With this technique, I'm now ready to move forward! Finally, after these years of delay, I’ll use my next few blog posts to build from these simpler pieces a complete Ptolemaic model of the solar system — all animated in the browser without any matplotlib involved.
Decided to pull the plug on the Do the Work blog.
While I wanted to use it for my tiny, crazy, work-in-progress thoughts, I found that it was increasingly being subsumed by my new shiny Mastodon.
And as the volume of things I write now scales up, I do not want another place to maintain.
I’ve migrated all my posts here.
Will use this blog for the long and medium stuff going forward.
And the short, ooh shiny stuff is on my Mastodon account.
Read the kooky struggle there, and if you want a social account that does not do evil things to you, get an account yourself.
Thank you for coming along :)
!pip show pandas statement in the IPython console. If it is not installed, you can install it by using the command !pip install pandas. We are going to use a dataset containing details of flights departing from NYC in 2013. This dataset has 32735 rows and 16 columns. See column names below. To import the dataset, we use the read_csv() function from the pandas package.
['year', 'month', 'day', 'dep_time', 'dep_delay', 'arr_time',
'arr_delay', 'carrier', 'tailnum', 'flight', 'origin', 'dest',
'air_time', 'distance', 'hour', 'minute']
import pandas as pd
df = pd.read_csv("https://dyurovsky.github.io/psyc201/data/lab2/nycflights.csv")
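After loading, the shape and column names can be confirmed with df.shape and df.columns. The sketch below uses a two-row in-memory CSV as a stand-in for the real file so that it runs without network access; the row values are made up for illustration only and are not taken from the actual dataset.

```python
import io

import pandas as pd

# Two made-up rows with the same 16 column names as the flights dataset
# (illustrative values only, not from the real file).
sample_csv = io.StringIO(
    "year,month,day,dep_time,dep_delay,arr_time,arr_delay,carrier,"
    "tailnum,flight,origin,dest,air_time,distance,hour,minute\n"
    "2013,1,1,900,0,1200,5,B6,N00001,1,JFK,BOS,40,187,9,0\n"
    "2013,1,1,930,-2,1100,-4,AA,N00002,2,LGA,ORD,120,733,9,30\n"
)
df = pd.read_csv(sample_csv)

print(df.shape)             # (number of rows, number of columns)
print(df.columns.tolist())  # list of column names
```

Run against the real file, the same two calls should report the row and column counts quoted in the text.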
B6 with origin from JFK airport