Some of the upcoming improvements have been covered by previous dev blogs, such as the completely reworked server infrastructure, as well as the new Online Creation System that allows for editing in-game prototypes using a powerful menu system.
This week we welcome Jacqueline Kazil (@JackieKazil) as our PyDev of the Week! She is the co-author of Data Wrangling with Python. Jacqueline is the creator of the Mesa package. You can see what other projects she is working on by going to GitHub. Let’s take a few moments to get to know her better!
Can you tell us a little about yourself (hobbies, education, etc):
With a ten-month-old daughter, a day job, volunteer work, and working on my Ph.D, I have no time these days for dedicated hobbies. Sometimes I pick up the uke and try to play a song for the baby. She is too young to realize how bad I am, but my husband certainly knows.
Why did you start using Python?
I started using Python when I was studying journalism at the University of Missouri. The idea of storytelling with data was appealing to me. I was influenced by Adrian Holovaty, one of the creators of Django, who preceded me at MU. The University of Missouri is also home to Investigative Reporters and Editors and the National Institute for Computer Assisted Reporting, which fostered a community around using data to tell stories. I took a crash course class in Django at one of their conferences.
For my master’s project, I worked at The Washington Post on their politics team during the 2008 Presidential primaries building out data applications that shared things like exit poll data and Obama’s schedules.
What other programming languages do you know and which is your favorite?
Some of the languages that I have programmed in include JavaScript, Go, PHP, SQL, VB, and ActionScript. As for my favorite, is that a trick question?
What projects are you working on now?
I am working on one major library and that is one that I started — Mesa. It allows you to create agent-based models in Python.
Which Python libraries are your favorite (core or 3rd party)?
Besides my own library, I will say that I have recently used Click and Cookie Cutter. A classic library that I feel like I use all of the time is the Requests library.
What made you decide to write a book about Python?
The book was originally written to empower journalists to wrangle data. While writing it, I realized that other professions, such as business analysts, would also be able to get something from it. The target audience was individuals who want to get stuff done quickly, even if it means doing it a little messily.
What have you learned authoring a book?
Would you do anything differently if you were able to start over from scratch?
There are two things I would do differently. The first is that I would probably consider writing it in Python 3. I originally chose Python 2.7 because it was easier to set up for a lot of people and was still a solid standard when we started writing. However, now Python 3 is the standard. The second thing I would do is write multiple shorter books. My first book was 500 pages; it could have easily been broken into two or three books.
Thanks for doing the interview!
@ced wrote:
We are proud to announce the release of the version 1.0.0 of python-sql.
python-sql is a library to write SQL queries in a pythonic way. It is mainly developed for Tryton but it has no external dependencies and it is agnostic to any framework or SQL database.
In addition to bug-fixes, this release contains the following improvements:
- Add Flavor filter_ to fallback to case expression
- Allow to use expression in AtTimeZone
- Add comparison predicates
- Add COLLATE
python-sql is available on PyPI: https://pypi.org/project/python-sql/1.0.0/
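For readers new to the library, here is a minimal sketch of the pythonic query building, adapted from the project's documented usage (double-check the exact output against the release you install):
from sql import Table

user = Table('user')
select = user.select()
select.where = user.name == 'foo'

# tuple(select) renders the SQL string plus its parameters, e.g.
# ('SELECT * FROM "user" AS "a" WHERE ("a"."name" = %s)', ('foo',))
print(tuple(select))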
Have you ever dockerized your Celery app and wondered where the Celery worker banner in the docker log output went? Usually, when you start your Celery worker, it comes up with a startup screen that displays useful information like version, broker, result backend, concurrency and subscribed queues.
-------------- celery@cheetah v4.2.1 (windowlicker)
---- **** -----
--- * *** * -- Linux-4.9.93-linuxkit-aufs-x86_64-with-debian-9.5 2018-09-29 15:21:28
-- * - **** ---
- ** ---------- [config]
- ** ---------- .> app: __main__:0x22cf0c047b8
- ** ---------- .> transport: redis://localhost:6379/0
- ** ---------- .> results: redis://localhost:6379/0
- *** --- * --- .> concurrency: 4 (prefork)
-- ******* ---- .> task events: ON
--- ***** -----
-------------- [queues]
.> celery exchange=celery(direct) key=celery
[tasks]
. tasks.task
[2018-09-29 16:30:34,998: INFO/MainProcess] Connected to redis://localhost:6379/0
[2018-09-29 16:30:36,136: INFO/MainProcess] mingle: searching for neighbors
However, when you dockerize your Celery worker (celery worker --app=worker.app --loglevel=INFO), start it up with docker-compose up -d and fetch the logs with docker logs ..., something unexpected happens - the worker startup banner is missing (clone the GitHub example repo to reproduce it).
[2018-09-30 09:22:51,849: INFO/MainProcess] Connected to redis://redis:6379/0
[2018-09-30 09:22:51,872: INFO/MainProcess] mingle: searching for neighbors
[2018-09-30 09:22:52,904: INFO/MainProcess] mingle: all alone
[2018-09-30 09:22:52,917: INFO/MainProcess] worker@f6d948a65d3f ready.
It gets even more mysterious if you bash into your running worker container to start up another Celery worker manually.
$ docker-compose exec worker bash
root@8ec76438799a:/app# celery worker --app=worker.app --loglevel=INFO
[2018-10-01 09:06:24,990: INFO/MainProcess] Customize Celery logger, default handler: <StreamHandler <stderr> (NOTSET)>
-------------- celery@8ec76438799a v4.2.0 (windowlicker)
---- **** -----
--- * *** * -- Linux-4.9.93-linuxkit-aufs-x86_64-with-debian-9.5 2018-10-01 09:06:25
-- * - **** ---
- ** ---------- [config]
- ** ---------- .> app: __main__:0x7f4ef055f358
- ** ---------- .> transport: redis://redis:6379/0
- ** ---------- .> results: redis://redis:6379/0
- *** --- * --- .> concurrency: 2 (prefork)
-- ******* ---- .> task events: OFF (enable -E to monitor tasks in this worker)
--- ***** -----
-------------- [queues]
.> celery exchange=celery(direct) key=celery
[tasks]
. tasks.task
[2018-10-01 09:06:25,095: INFO/MainProcess] Connected to redis://redis:6379/0
[2018-10-01 09:06:25,105: INFO/MainProcess] mingle: searching for neighbors
Yay, there it is! Well, at least there, but unfortunately there is no sign of this new worker banner in the docker logs (docker-compose logs worker). For some reason, docker does not capture this output.
What on earth is going on here? Spoiler: even though I will show you a solution to make the banner show up in the logs, I still haven’t fully understood the root cause. If you fully understand what’s going on here, please come forward and comment below!
So, in order to close in on this mystery, let’s have a look at the Celery source code.
When the worker starts up, it is print(..., file=sys.__stdout__) that is responsible for printing the banner to the console. To /dev/stdout, which redirects to /proc/self/fd/1, to be more precise.
Which, in turn, should all be fine according to the docker docs:
By default, docker logs or docker service logs shows the command’s output just as it would appear if you ran the command interactively in a terminal.
This doesn’t seem to be the case here. Or does it… and it’s not a docker but a Python logging issue?
Let’s go back to the Celery side of things and examine the default logger by connecting the after_setup_logger signal (for more details on Celery logging, have a look at this blog post: 3 Strategies to Customise Celery logging handlers):
import os
import logging

from celery import Celery
from celery.signals import after_setup_logger

app = Celery()
app.conf.update({
    'broker_url': os.environ['CELERY_BROKER_URL'],
    'imports': ('tasks', ),
    'task_serializer': 'json',
    'result_serializer': 'json',
    'accept_content': ['application/json'],
    'result_backend': os.environ['CELERY_RESULT_BACKEND']
})

@after_setup_logger.connect
def setup_loggers(logger, *args, **kwargs):
    logger.info(f'Customize Celery logger, default handler: {logger.handlers[0]}')
This produces an extra line which tells us that the default handler is a StreamHandler which writes to sys.stderr.
[2018-09-30 09:32:00,130: INFO/MainProcess] Customize Celery logger, default handler: <StreamHandler <stderr> (NOTSET)>
Remember how the worker uses print(file=sys.__stdout__) to print the banner? Even though print and logger seem totally unrelated to me, adding a StreamHandler(sys.stdout) handler to the Celery logger (i.e. explicitly streaming to stdout) makes the startup banner appear in the docker logs:
import sys

@after_setup_logger.connect
def setup_loggers(logger, *args, **kwargs):
    # Explicitly add a handler that streams to stdout
    logger.addHandler(logging.StreamHandler(sys.stdout))
So we have a solution but no good answer. Is it Docker or is it Python? Are you able to find the missing jigsaw puzzle piece?
@ced wrote:
We are glad to announce the release 0.8.1 of relatorio.
Relatorio is a templating library which provides a way to easily output several kinds of files (odt, ods, png, svg, …). It is a bug-fix release which includes:
- Add support for Python 3.7
- Escape invalid XML characters
- Enforce closing tag to be the same directive as the opening
- Use compression for zip file
- Write mimetype as first file of the zip file
The package is available at https://pypi.org/project/relatorio/0.8.1/
The documentation is available at https://relatorio.readthedocs.io/en/0.8.1/
After starting your first Python project, you might realize that it is actually not that obvious to be consistent with the way you write Python code. If you collaborate with other developers, your code style might differ, and the code can become somehow unreadable.
I hate coding style discussions as much as every engineer, I guess. Who has not seen hours of nitpicking in code reviews, a heated debate around the coffee machine, or nerf gun battles to decide where the semicolon should be?
When I start a new project, the first thing I do is set up an automated style check. With that in place, no time is wasted during code reviews on something a program is good at checking: coding style consistency. Since coding style is a touchy subject, that's all the more reason to tackle it at the beginning of the project.
Python has an amazing quality that few other languages have: it uses indentation to define blocks. While it offers a solution to the age-old question of "where should I put my curly braces?", it introduces a new question in the process: "how should I indent?".
I imagine that it was one of the first questions that was raised in the community, so the Python folks, in their vast wisdom, came up with the PEP 8: Style Guide for Python Code.
This document defines the standard style for writing Python code. The list of guidelines boils down to:
- Use one import statement per line, at the top of the file, after comments and docstrings, grouped first by standard, then third-party, and finally local library imports.
- Name classes in CamelCase; suffix exceptions with Error (if applicable); name functions in lowercase with words separated_by_underscores; and use a leading underscore for _private attributes or methods.
These guidelines really aren't hard to follow and they make a lot of sense. Most Python programmers have no trouble sticking to them as they write code.
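As a small illustration (my own example, not part of PEP 8 itself; requests, myproject and utils.parse are placeholders, included only to show the grouping), a module header that follows the import and naming guidelines might look like this:
# Standard library imports come first, one import per line...
import os

# ...then third-party imports...
import requests

# ...and finally local library imports.
from myproject import utils


class ConfigError(Exception):
    """Classes use CamelCase; exceptions end in Error."""


def load_config(path):
    """Functions are lowercase, with words separated_by_underscores."""
    return utils.parse(os.path.expanduser(path))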
However, errare humanum est, and it's still a pain to look through your code to make sure it fits the PEP 8 guidelines. That's what the pycodestyle tool (formerly called pep8) is there for: it can automatically check any Python file you send its way.
$ pycodestyle hello.py
hello.py:4:1: E302 expected 2 blank lines, found 1
$ echo $?
1
pycodestyle indicates which lines and columns do not conform to PEP 8 and reports each issue with a code. Violations of MUST statements in the specification are reported as errors, and their error codes start with an E. Minor issues are reported as warnings, and their codes start with a W. The three-digit code following the first letter indicates the exact kind of error or warning.
You can tell the general category of an error code at a glance by looking at the hundreds digit: for example, errors starting with E2 indicate issues with whitespace; errors starting with E3 indicate issues with blank lines; and warnings starting with W6 indicate deprecated features being used.
I advise you to take PEP 8 seriously and to run a validation tool against your source code on a regular basis. An easy way to do this is to integrate it into your continuous integration system: it's a good way to ensure that you continue to respect the PEP 8 guidelines in the long term.
Most open source projects enforce PEP 8 conformance through automatic checks. Doing so from the beginning of the project might frustrate newcomers, but it also ensures that the codebase always looks the same in every part of the project. This is very important for a project of any size where there are multiple developers with differing opinions on whitespace ordering. You know what I mean.
It's also possible to ignore certain kinds of errors and warnings by using the --ignore option:
$ pycodestyle --ignore=E3 hello.py
$ echo $?
0
This allows you to effectively ignore parts of the PEP 8 specification that you don't want to follow. If you're running pycodestyle on an existing code base, it also allows you to ignore certain kinds of problems so you can focus on fixing issues one category at a time.
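The same checks can also be run from Python, which is handy for wiring them into a test suite or CI job. A minimal sketch based on pycodestyle's StyleGuide API (verify the option names against the version you have installed):
import pycodestyle

# Check a file, ignoring the E3xx (blank line) class of errors
style = pycodestyle.StyleGuide(ignore=['E3'])
report = style.check_files(['hello.py'])

if report.total_errors:
    raise SystemExit('%d PEP 8 violation(s) found' % report.total_errors)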
If you write C code for Python (e.g. modules), the PEP 7 standard describes the coding style that you should follow.
Other tools also exist that check for actual coding errors rather than style errors; notable examples include pyflakes and pylint.
These tools all make use of static analysis — that is, they parse the code and analyze it rather than running it outright.
If you choose to use pyflakes (which I recommend), note that it doesn't check PEP 8 conformance on its own; you would still need pycodestyle to do that. That means you need two different tools to have proper coverage.
In order to simplify things, a project named flake8 exists and combines pyflakes and pycodestyle into a single command. It also adds some new fancy features: for example, it can skip checks on lines containing # noqa and is extensible via plugins.
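For example, a line can opt out of checking entirely, or out of a single error code (an illustrative snippet; the settings module here is hypothetical):
# flake8 skips all checks on a line that ends with "# noqa"
import os, sys  # noqa

# It can also silence just one specific code
from settings import *  # noqa: F403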
There are a large number of plugins available for flake8 that you can just use. For example, installing flake8-import-order (with pip install flake8-import-order) will extend flake8 so it also checks that your import statements are sorted alphabetically in your source code.
flake8 is now heavily used in most open source projects for code style verification. Some large open source projects have even written their own plugins, adding checks for errors such as odd usage of except, Python 2/3 portability issues, import style, dangerous string formatting, possible localization issues, etc.
If you're starting a new project, I strongly recommend you use one of these tools and rely on it for automatic checking of your code quality and style. If you already have a codebase, a good approach is to run them with most of the warnings disabled and fix issues one category at a time.
While none of these tools may be a perfect fit for your project or your preferences, using flake8 and its plugins is a good way to improve the quality of your code and make it more durable. If nothing else, it's a good start toward that goal.
Many text editors, including the famous GNU Emacs and vim, have plugins available (such as Flycheck) that can run tools such as pep8 or flake8 directly in your code buffer, interactively highlighting any part of your code that isn't PEP 8-compliant. This is a handy way to fix most style errors as you write your code.
Today we've issued the 2.0.9 and 1.11.16 bugfix releases.
The release package and checksums are available from our downloads page, as well as from the Python Package Index. The PGP key ID used for this release is Carlton Gibson: E17DF5C82B4F9D00.
In accordance with our security release policy, the Django team is issuing Django 2.1.2. This release addresses the security issue detailed below. We encourage all users of Django to upgrade as soon as possible.
If an admin user has the change permission to the user model, only part of the password hash is displayed in the change form. Admin users with the view (but not change) permission to the user model were displayed the entire hash. While it's typically infeasible to reverse a strong password hash, if your site uses weaker password hashing algorithms such as MD5 or SHA1, it could be a problem.
Thanks Phithon Gong for reporting this issue.
Patches to resolve the issue have been applied to Django's master branch and the 2.1 release branch. The patches may be obtained from the following changesets:
The following release has been issued: Django 2.1.2.
The PGP key ID used for these releases is Carlton Gibson: E17DF5C82B4F9D00.
As always, we ask that potential security issues be reported via private email to security@djangoproject.com, and not via Django's Trac instance or the django-developers list. Please see our security policies for further information.
The latest version of Mu is here! Version 1.0.1 is a bug-fix release and all but one of the thirty or more contributions, changes and fixes were made by members of our community. I’m especially proud that this update means Mu supports the following translations: German, Spanish, French, Japanese, Polish, Portuguese (including Brasil), Swedish, Vietnamese and Chinese. Apparently there are Greek and Turkish translations on the way.
We expect the next version of Mu with new features (Python 3.7, some new modes, improvements to configuration, more bug fixes and better translations) to arrive some-time towards the end of this year.
If you spot any bugs, please don’t hesitate to let us know.
Once again, many thanks to all the people in our community who have made such important and helpful contributions: René Raab, Limor Fried, GitHub user @doanminhdang, Martin Dybdal, René Dudfield, Marco A L Barbosa, Justin Riley, Filip Korling, Nick Morrott, John Guan, Filip Kłębczyk, Tim McCurrach, Damien George, Zander Brown, Carlos Pereira Atencio and Tim Golden.
In this article we review last week's Create your own Pomodoro Timer code challenge.
From now on we will merge our solution into our Community branch and include anything noteworthy here, because:
we are learning just like you, we are all equals :)
we need the PRs too ;) ... as part of Hacktoberfest No. 5 that just kicked off (5 PRs and you get a cool t-shirt)
Secondly we encourage you to send us a quotable blurb for our weekly review post (what you are reading now), see this new message on our platform's PR submit page:
Don't be shy, share your work!
Check out the awesome PRs by our community for PCC52 (or from your fork: git checkout community && git merge upstream/community):
Some cool stuff that got PR'd for this challenge: tkinter, argparse, pytest, wxPython, Django, awesome, no?!
You can look at all submitted code here and/or pull our Community branch.
Other learnings we spotted in Pull Requests for other challenges this week: collections, itertools, xml files, list comprehensions.
Thanks to everyone for your participation in our blog code challenges!
Keep the PRs coming, again especially this Hacktoberfest month!
Subscribe to our blog (sidebar) to get a new PyBites Code Challenge (PCC) in your inbox each Monday.
And/or take any of our 50+ challenges on our platform.
Prefer coding self-contained exercises in the comfort of your browser? Try our growing collection of Bites of Py.
Want to do the #100DaysOfCode but not sure what to work on? Take our course and/or start logging your progress on our platform.
Keep Calm and Code in Python!
-- Bob and Julian
It's not that I'm so smart, it's just that I stay with problems longer. - A. Einstein
Hey Pythonistas,
We bet this one's going to be music to your ears (another quality pun!).
This week, query the Spotify API and write some code that performs the set of tasks we've listed below.
If you have another music service you'd rather use, feel free!
Now for the challenge tasks:
Query Spotify and grab a list of album names for any given artist. Bonus points if the script allows the user to specify the artist!
Create a playlist on your account if you don't already have one. Write a script that queries Spotify for the playlist and returns all tracks/songs in the playlist.
This one's tricky: write a script that obtains a list of the top tracks for an artist and see if any of those songs exist in your playlist.
Feel free to dig into the API and go wild!
If you don't have an existing playlist, just create one with random artists and use it as your data set.
Hint: Check out the Spotipy Python Library to make this a little easier on yourself.
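If you go the Spotipy route, the first task could start out roughly like this (a sketch, not a full solution; it assumes you have set the SPOTIPY_CLIENT_ID and SPOTIPY_CLIENT_SECRET environment variables, and the method names follow the Spotipy docs, so verify them before relying on this):
import spotipy
from spotipy.oauth2 import SpotifyClientCredentials

sp = spotipy.Spotify(client_credentials_manager=SpotifyClientCredentials())

# Look up the artist, then print that artist's album names
result = sp.search(q='artist:Nirvana', type='artist', limit=1)
artist = result['artists']['items'][0]

for album in sp.artist_albums(artist['id'], album_type='album')['items']:
    print(album['name'])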
If you need help getting ready with Github, see our new instruction video.
A few more things before we take off:
Do you want to discuss this challenge and share your Pythonic journey with other passionate Pythonistas? Confirm your email on our platform then request access to our Slack via settings.
PyBites is here to challenge you because becoming a better Pythonista requires practice, a lot of it. For any feedback, issues or ideas use GH Issues, tweet us or ping us on our Slack.
>>> from pybites import Bob, Julian
Keep Calm and Code in Python!
There are few guarantees in life: death, taxes, and programmers needing to deal with strings. Strings can come in many forms. They could be unstructured text, usernames, product descriptions, database column names, or really anything else that we describe using language.
With the near-ubiquity of string data, it’s important to master the tools of the trade when it comes to strings. Luckily, Python makes string manipulation very simple, especially when compared to other languages and even older versions of Python.
In this article, you will learn some of the most fundamental string operations: splitting, concatenating, and joining. Not only will you learn how to use these tools, but you will walk away with a deeper understanding of how they work under the hood.
Take the Quiz: Test your knowledge with our interactive “Splitting, Concatenating, and Joining Strings in Python” quiz. Upon completion you will receive a score so you can track your learning progress over time.
In Python, strings are represented as str objects, which are immutable: this means that the object as represented in memory can not be directly altered. These two facts can help you learn (and then remember) how to use .split().
Have you guessed how those two features of strings relate to splitting functionality in Python? If you guessed that .split() is an instance method because strings are a special type, you would be correct! In some other languages (like Perl), the original string serves as an input to a standalone .split() function rather than a method called on the string itself.
Note: Ways to Call String Methods
String methods like .split() are mainly shown here as instance methods that are called on strings. They can also be called as static methods, but this isn’t ideal because it’s more “wordy.” For the sake of completeness, here’s an example:
# Avoid this:
str.split('a,b,c', ',')
This is bulky and unwieldy when you compare it to the preferred usage:
# Do this instead:
'a,b,c'.split(',')
For more on instance, class, and static methods in Python, check out our in-depth tutorial.
What about string immutability? This should remind you that string methods are not in-place operations, but they return a new object in memory.
Note: In-Place Operations
In-place operations are operations that directly change the object on which they are called. A common example is the .append() method that is used on lists: when you call .append() on a list, that list is directly changed by adding the input to .append() to that same list.
Before going deeper, let’s look at a simple example:
>>> 'this is my string'.split()
['this', 'is', 'my', 'string']
This is actually a special case of a .split() call, which I chose for its simplicity. Without any separator specified, .split() will count any whitespace as a separator.
Another feature of the bare call to .split() is that it automatically cuts out leading and trailing whitespace, as well as consecutive whitespace. Compare calling .split() on the following string without a separator parameter and with ' ' as the separator parameter:
>>> s = ' this   is  my string '
>>> s.split()
['this', 'is', 'my', 'string']
>>> s.split(' ')
['', 'this', '', '', 'is', '', 'my', 'string', '']
The first thing to notice is that this showcases the immutability of strings in Python: subsequent calls to .split() work on the original string, not on the list result of the first call to .split().
The second—and the main—thing you should see is that the bare .split() call extracts the words in the sentence and discards any whitespace.
.split(' '), on the other hand, is much more literal. When there are leading or trailing separators, you’ll get an empty string, which you can see in the first and last elements of the resulting list.
Where there are multiple consecutive separators (such as between “this” and “is” and between “is” and “my”), the first one will be used as the separator, and the subsequent ones will find their way into your result list as empty strings.
Note: Separators in Calls to .split()
While the above example uses a single space character as a separator input to .split(), you aren’t limited in the types of characters or length of strings you use as separators. The only requirement is that your separator be a string. You could use anything from "..." to even "separator".
.split() has another optional parameter called maxsplit. By default, .split() will make all possible splits when called. When you give a value to maxsplit, however, only the given number of splits will be made. Using our previous example string, we can see maxsplit in action:
>>> s="this is my string">>> s.split(maxsplit=1)['this', 'is my string']
As you see above, if you set maxsplit to 1, the first whitespace region is used as the separator, and the rest are ignored. Let’s do some exercises to test out everything we’ve learned so far.
What happens when you give a negative number as the maxsplit parameter? .split() will split your string on all available separators, which is also the default behavior when maxsplit isn’t set.
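A quick check in the REPL (my own example, not part of the exercise) confirms that behavior:
>>> 'a,b,c,d'.split(',', maxsplit=-1)
['a', 'b', 'c', 'd']
>>> 'a,b,c,d'.split(',')
['a', 'b', 'c', 'd']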
You were recently handed a comma-separated value (CSV) file that was horribly formatted. Your job is to extract each row into a list, with each element of that list representing the columns of that file. What makes it badly formatted? The “address” field includes multiple commas but needs to be represented in the list as a single element!
Assume that your file has been loaded into memory as the following multiline string:
Name,Phone,Address
Mike Smith,15554218841,123 Nice St, Roy, NM, USA
Anita Hernandez,15557789941,425 Sunny St, New York, NY, USA
Guido van Rossum,315558730,Science Park 123, 1098 XG Amsterdam, NL
Your output should be a list of lists:
[['Mike Smith', '15554218841', '123 Nice St, Roy, NM, USA'],
 ['Anita Hernandez', '15557789941', '425 Sunny St, New York, NY, USA'],
 ['Guido van Rossum', '315558730', 'Science Park 123, 1098 XG Amsterdam, NL']]
Each inner list represents the rows of the CSV that we’re interested in, while the outer list holds it all together.
Here’s my solution. There are a few ways to attack this. The important thing is that you used .split() with all its optional parameters and got the expected output:
input_string="""Name,Phone,AddressMike Smith,15554218841,123 Nice St, Roy, NM, USAAnita Hernandez,15557789941,425 Sunny St, New York, NY, USAGuido van Rossum,315558730,Science Park 123, 1098 XG Amsterdam, NL"""defstring_split_ex(unsplit):results=[]# Bonus points for using splitlines() here instead, # which will be more readableforlineinunsplit.split('\n')[1:]:results.append(line.split(',',maxsplit=2))returnresultsprint(string_split_ex(input))
We call .split() twice here. The first usage can look intimidating, but don’t worry! We’ll step through it, and you’ll get comfortable with expressions like these. Let’s take another look at the first .split() call: unsplit.split('\n')[1:].
The first element is unsplit, which is just the variable that points to your input string. Then we have our .split() call: .split('\n'). Here, we are splitting on a special character called the newline character.
What does \n do? As the name implies, it tells whatever is reading the string that every character after it should be shown on the next line. In a multiline string like our input_string, there is a hidden \n at the end of each line.
The final part might be new: [1:]. The statement so far gives us a new list in memory, and [1:] looks like a list index notation, and it is—kind of! This extended index notation gives us a list slice. In this case, we take the element at index 1 and everything after it, discarding the element at index 0.
In all, we iterate through a list of strings, where each element represents each line in the multiline input string except for the very first line.
At each string, we call .split() again using ',' as the split character, but this time we are using maxsplit to only split on the first two commas, leaving the address intact. We then append the result of that call to the aptly named results array and return it to the caller.
The other fundamental string operation is the opposite of splitting strings: string concatenation. If you haven’t seen this word, don’t worry. It’s just a fancy way of saying “gluing together.”
Concatenating With the + Operator
There are a few ways of doing this, depending on what you’re trying to achieve. The simplest and most common method is to use the plus symbol (+) to add multiple strings together. Simply place a + between as many strings as you want to join together:
>>> 'a' + 'b' + 'c'
'abc'
In keeping with the math theme, you can also multiply a string to repeat it:
>>> 'do' * 2
'dodo'
Remember, strings are immutable! If you concatenate or repeat a string stored in a variable, you will have to assign the new string to another variable in order to keep it.
>>> orig_string = 'Hello'
>>> orig_string + ', world'
'Hello, world'
>>> orig_string
'Hello'
>>> full_sentence = orig_string + ', world'
>>> full_sentence
'Hello, world'
If we didn’t have immutable strings, full_sentence would instead output 'Hello, world, world'.
Another note is that Python does not do implicit string conversion. If you try to concatenate a string with a non-string type, Python will raise a TypeError:
>>> 'Hello' + 2
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: must be str, not int
This is because you can only concatenate strings with other strings, which may be new behavior for you if you’re coming from a language like JavaScript, which attempts to do implicit type conversion.
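If you do need to combine a string with a number, the usual fix (my own addition, not shown in the original example) is an explicit conversion with str():
>>> 'Hello ' + str(2)
'Hello 2'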
Joining Strings With .join()
There is another, more powerful, way to join strings together: the .join() method.
The common use case here is when you have an iterable—like a list—made up of strings, and you want to combine those strings into a single string. Like .split(), .join() is a string instance method. If all of your strings are in an iterable, which one do you call .join() on?
This is a bit of a trick question. Remember that when you use .split(), you call it on the string or character you want to split on. The opposite operation is .join(), so you call it on the string or character you want to use to join your iterable of strings together:
>>> strings = ['do', 're', 'mi']
>>> ','.join(strings)
'do,re,mi'
Here, we join each element of the strings list with a comma (,) and call .join() on it rather than on the strings list.
How could you make the output text more readable?
One thing you could do is add spacing:
>>> strings = ['do', 're', 'mi']
>>> ', '.join(strings)
'do, re, mi'
By doing nothing more than adding a space to our join string, we’ve vastly improved the readability of our output. This is something you should always keep in mind when joining strings for human readability.
.join() is smart in that it inserts your “joiner” in between the strings in the iterable you want to join, rather than just adding your joiner at the end of every string in the iterable. This means that if you pass an iterable of size 1, you won’t see your joiner:
>>> 'b'.join(['a'])
'a'
Using our web scraping tutorial, you’ve built a great weather scraper. However, it loads string information in a list of lists, each holding a unique row of information you want to write out to a CSV file:
[['Boston', 'MA', '76F', '65% Precip', '0.15 in'],
 ['San Francisco', 'CA', '62F', '20% Precip', '0.00 in'],
 ['Washington', 'DC', '82F', '80% Precip', '0.19 in'],
 ['Miami', 'FL', '79F', '50% Precip', '0.70 in']]
Your output should be a single string that looks like this:
"""Boston,MA,76F,65% Precip,0.15inSan Francisco,CA,62F,20% Precip,0.00 inWashington,DC,82F,80% Precip,0.19 inMiami,FL,79F,50% Precip,0.70 in"""
For this solution, I used a list comprehension, which is a powerful feature of Python that allows you to rapidly build lists. If you want to learn more about them, check out this great article that covers all the comprehensions available in Python.
Below is my solution, starting with a list of lists and ending with a single string:
input_list = [['Boston', 'MA', '76F', '65% Precip', '0.15 in'],
              ['San Francisco', 'CA', '62F', '20% Precip', '0.00 in'],
              ['Washington', 'DC', '82F', '80% Precip', '0.19 in'],
              ['Miami', 'FL', '79F', '50% Precip', '0.70 in']]

# We start with joining each inner list into a single string
joined = [','.join(row) for row in input_list]

# Now we transform the list of strings into a single string
output = '\n'.join(joined)

print(output)
Here we use .join() not once, but twice. First, we use it in the list comprehension, which does the work of combining all the strings in each inner list into a single string. Next, we join each of these strings with the newline character \n that we saw earlier. Finally, we simply print the result so we can verify that it is as we expected.
While this concludes this overview of the most basic string operations in Python (splitting, concatenating, and joining), there is still a whole universe of string methods that can make your experiences with manipulating strings much easier.
Once you have mastered these basic string operations, you may want to learn more. Luckily, we have a number of great tutorials to help you complete your mastery of Python’s features that enable smart string manipulation:
Take the Quiz: Test your knowledge with our interactive “Splitting, Concatenating, and Joining Strings in Python” quiz. Upon completion you will receive a score so you can track your learning progress over time.
@ced wrote:
We are proud to announce the 5.0 release of Tryton. This is the first Long Term Support release which means that it will be supported for 5 years.
As usual, the migration from previous series is fully supported. Some manual operations may be required, see Migration from 4.8 to 5.0.
Contents:
Notes for users
User Interface
New design
The design of the desktop and web client has been reworked a lot to be more pleasant and less cluttered.
We chose a new set of icons which can be fully shared between desktop and web client. So a user will not get lost when switching from one client to the other. The new icons are based on the Material Design Icons from Google which are simple and clear.
In the web client we moved the field icons inside the input widgets. This spares some extra decorations and makes it look like the desktop client.
For the desktop client we use a new standard base, which allows better integration into the desktop environment. For example, the application menu is rendered in the global toolbar on macOS.
Attachments
In Tryton it is possible to attach files to any document. Until now, the toolbar button popped up a dialog with a list of attached files. It required a lot of steps from the user to add a new file or to open an existing one, even when dragging & dropping the file on the attachment button.
To speed up the tasks we changed the button to show a drop down list of all attached files. When the user clicks on a file, it opens directly. In the drop down list there are two additional operations available:
- the Add… option which opens a file selector and creates the attachment, and
- the Manage… option which opens the former popup dialog.
Keyboard Shortcuts
The shortcuts between the two clients are now unified as much as possible. In cases where a unification is not possible, the web client uses the ALT key. E.g. the desktop client uses CTRL+w to close a tab. This shortcut doesn’t work in the web client, because the browser would catch it and close the browser tab with the whole application running inside. So the web client uses ALT-w instead. The keyboard shortcuts are listed by pressing F1 in the web client and shown in the form menu of the desktop client.
Web Client
The web client has some missing features compared to the desktop client, as its development started later. Every release we reduce this feature-gap and this release isn’t an exception.
The desktop client has a contextual menu on relation fields which allows opening a record, launching actions and reports, etc. Such a contextual menu isn’t available on the web client, but we added an option to open the relational record in a new tab if the CTRL key is pressed. This gives easy access to all the features in the toolbar.
The file selection widget has been simplified. It no longer opens an intermediate dialog with a file selection button. Now it opens the file selection dialog directly.
The desktop client has supported right-to-left languages for a long time, and now the web client does too.
Moreover it is now possible to toggle the menu on large screens. This allows the user to use the full screen when needed.
We updated the address bar of the browser to correspond to the current tab. The URL in the address bar can be shared to open the same tab by different users, like in the desktop client. Opening a shared URL doesn’t require a re-login, because the browser stores the session as long as it is valid. The user can also use the back and forward buttons of the browser to switch to the previous or next tab. This is useful when the user closes a tab by error.
Similar to the desktop client, the web client doesn’t open the same tab twice. It puts the focus on the existing tab instead. Unlike the desktop client, there is no extra option to force opening a tab multiple times, because in a browser environment the user is able to open the same URL in multiple new tabs.
Financial Accounting
Accounts Removed from Product
For simplicity and responsibility separation, we removed the accounting definitions, like the revenue and expense accounts or the taxes, from the product form in favor of a unique accounting category. This simplifies product creation for the product administrators, as they no longer need to care about the accounting properties. The accounting categories now must be managed by accounting administrators only.
Account Evolution
In some countries, the chart of accounts evolves frequently over time. The previous solution based on logical deletion was not flexible enough, because such an inactive account may still need to be active in some fiscal years. Instead, it is now possible to define a validity period on accounts and tax codes. The accounts and codes are then visible for all the reports that cross their period.
In case an invalid account was used on a referential document like a party or an accounting category, a replacement account can be defined. This account will be used transparently instead of the original on all operational documents (like sales or purchases). This avoids the need to reconfigure all the parties or categories.
Tax Export for Spain (AEAT)
The Spanish authorities have defined standard files for tax reporting. Tryton automatically generates the files Modelo 111, Modelo 115 and Modelo 303. This simplifies the submission to the authorities’ website and helps to prevent errors.
Chorus Export
The French administration requires suppliers to send their invoices for French public entities through the Chorus Pro portal. Tryton can automate this task. Every 15 minutes, it will send to the portal all supplier invoices posted for a party configured for Chorus Pro.
In addition to the user/password credentials, the portal requires an SSL certificate signed by a recognized authority.
Chorus Pro supports many formats of invoices. We chose to implement the Cross-Industry-Invoice from UN/CEFACT. As this is a standard format, it can also be re-used for other EDI integrations.
AEB43 Import Statement
Tryton adds the AEB43 format to the bank statement import functionality. It is a format commonly used in Spain.
With CODA and OFX, this raises the number of supported bank statement formats to 3.
Automating the import of bank statements is an important benefit of Tryton because it reduces encoding errors and speeds up the update of the receivable and payable accounts.
Write-off and Payment Methods
The default credit and debit accounts on the journal have been removed. They were used only for payments, statements and write-offs. We decided that it is better to have methods that define those accounts. So we can now configure write-off methods which can be used for reconciliation, but also payment methods which are used to register payments on invoices.
Another advantage is that the new behavior avoids the need to create journals per method.
Dunning
Sometimes customers do not pay even after the last level of the dunning procedure. Instead of losing track of those dunnings, which require a manual procedure, Tryton now keeps them under the “Final” tab until they are finally resolved.
In the same spirit of supporting a manual procedure, when an email is sent for a dunning but the party does not have an email address set, Tryton can send the email to a fall-back address. This address may, for example, belong to a secretary who will be in charge of forwarding the dunning to the right address.
Date of Asset Depreciation
The depreciation moves can be posted every month or every year. But until now it was always at the anniversary of the depreciation start. It is now possible to configure whether it should be posted on the first or the last day of the month and in which month of the year. This allows conforming better with the fiscal year of the company.
Sales
Grace Period
Sales can now be configured to have a grace period after the confirmation. This grace period allows the sale order to be reset to draft before it is processed. This is very useful because most of the time the user will notice a mistake just a few seconds after clicking the confirm button.
Parent on Price List
A frequent feature requests was to have the possibility in a price list to use the price of another price list. This is now possible. If you define a parent list then you can use the
parent_unit_price
keyword in the formula.
This feature allows to define complex cascading price lists.Inventory & Stock
Counting Wizard
We provide a new way to make inventories by counting each product in the location.
The wizard starts by asking for the product or lot that is being counted. Then it shows the current counting result and asks for the quantity to add (by default 1 for a product with a unit measure of 1).
It loops over those two steps until the user finishes the wizard. Then the inventory is updated for each product with the result of the count.
Lot Unit
It is now possible to define the maximum quantity of a lot. Tryton will do its best to enforce this constraint. For example, it will not allow a shipment to contain moves of the lot with a sum greater than the quantity of the lot, nor will it allow creating a move for a quantity greater than the lot.
This new feature can be used to define a lot as a serial number; it just requires setting 1 unit as the maximum quantity.
CRM
Attention Name
Sometimes it is necessary to store on a party an address that is not the real address of the party, so the party’s name can not be used as the name on the mailbox. We therefore added an optional “Party Name” on the address, which is used to format the address, while the name of the party is used as the attention name.
Purchasing
Grace Period
Just like for sales, purchases can be configured to have a grace period after the confirmation. This grace period allows the order to be reset to draft before it is processed.
Manufacturing
Outsourcing
It is now possible to outsource production orders. To activate this feature, the routing should have a supplier and a production service configured. In this case, when the production is set to the waiting state, a purchase order to the supplier is created for the configured service with the quantity corresponding to the produced quantity. The purchase costs will be added to the production costs, which will be used to update the cost price of the produced product.
Subscription Management
Asset
Some subscription contracts involve the renting of a company’s asset, so on the subscription service it is now possible to define a list of assets to rent with the service.
Tryton keeps track of which asset is rented per subscription and ensures that an asset is not rented twice for the same period.
Notes for developers
The major change in this release is the migration of the full code base to Python 3, which means only Python 3.4 or later is supported. We also added support for the new version 3.7 of Python.
Transactional queue
Tryton can be configured to use workers to execute tasks asynchronously. Any Model method can be queued by calling it from the Model.__queue__ attribute. The method must be an instance method or a classmethod that takes a list of records as its first argument. The other arguments must be JSON-ifiable.
The task posting can be configured using the context variables queue_name, queue_scheduled_at and queue_expected_at. The queue dispatches the tasks evenly to the available workers and each worker uses a configurable pool of processes (by default the number of CPUs) to execute them. The workers can be run on a different machine as long as they have access to the database.
If Tryton has no worker configured, the tasks are run directly at the end of the transaction. The processing of sales and purchases has already been adapted to use the queue (see above).
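As a rough illustration only (the model, method and queue names below are hypothetical, and the exact call style should be checked against the Tryton documentation), queuing a method call could look something like this:
from trytond.transaction import Transaction

# Hypothetical sketch: enqueue Sale.process(sales) so that a worker
# (or the end of the transaction, when no worker is configured)
# executes it instead of the current request.
with Transaction().set_context(queue_name='sale'):
    Sale.__queue__.process(sales)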
Real-time notification
We added a BUS to Tryton. It allows the server to send messages to the client. It is using long polling as push mechanism.
The first usage is the possibility to send notifications, which are short messages with a priority. The web client displays them by using Web Notification and the desktop client by using GNotification, which unfortunately is not yet implemented on Windows or macOS.
For now, the bus is disabled by default and must be activated in the configuration.
New session management
A double session timeout has been implemented. The session now expires after 30 days. Some operations like posting an invoice or approving a payment require a fresh session. A fresh session is a session which had no request interruption longer than 5 minutes since its creation.
When a user changes his password, all his active sessions are invalidated. This prevents any attacker who had stolen the password from keeping a session active after the password change.
Web client session
The sessions are now stored in the localStorage of the browser. This means that the session per server and database can be shared between tabs and survives a reload of the page.
Improved ModelStorage.copy
The copy method has been extended to have more flexibility on the copy result.
The default dictionary now accepts a callable as a value. It will be called for each copied record with a dictionary of the copied values. It must return the new value for the new record.
Also the default dictionary supports the dotted notation for Many2One keys. In such a case, the value will be used as the default dictionary to copy the pointed record.
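A small sketch of what the callable form might look like (the Party model and name field here are hypothetical; the callable receives the copied values and returns the value for the new record, as described above):
def _copy_name(values):
    # values is the dictionary of copied values for one record
    return "%s (copy)" % values['name']

new_parties = Party.copy(parties, default={'name': _copy_name})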
Extending depends on methods
The depends on methods were limited to only on_change or on_change_with methods. But this showed some limitations when trying to share common code between different on_change/_with methods. So we extended and generalized the behavior. Now we can define dependencies on any methods and use the @fields.depends decorator on any method.
Recursive common table expression for child_of/parent_of operators
For the evaluation of the child_of and parent_of domain operator, we used recursive ORM loops (or Modified Preorder Tree Traversal when available).
Now all supported databases support the recursive common table expression (CTE), so we could use it as default implementation.
Regarding performance, the recursive CTE is not better than MPTT, but it still avoids many round-trips compared to the recursive ORM loop.
Improve client loading strategy
The strategy has been improved to eagerly load only the fields that are on the same view. This new strategy is mainly useful when multiple form views are defined.
Improve index creation
It is now possible to create indexes with a SQL expression (instead of only columns) and with a where clause. This allows creating indexes tailored to specific queries.
There is only one limitation with the SQLite back-end, which cannot run the creation query if it has parameters. In such a case, no index is created and a warning is displayed.
Hacktoberfest is an amazing campaign by DigitalOcean and GitHub: you contribute at least 5 open source Pull Requests and then you get a T-shirt and some stickers.
Maintainers are encouraged to label and organize the issues to be worked on.
Register at: https://hacktoberfest.digitalocean.com/
I will list here some of my projects and the issues for which I am expecting to get some contributions.
Dynaconf is a library for settings management in any kind of Python project.
Skills needed: CI, Python, decorators, Python Module Initialization, Data Structures and Python data Model
Issues: https://github.com/rochacbruno/dynaconf/issues
Highlights:
Flasgger is the project powering the http://httpbin.org website; it allows you to document your Flask API and serves the Swagger UI.
Skills needed: REST APIs, Flask, OpenAPI Spec, decorators, Python data model.
Issues: https://github.com/rochacbruno/flasgger/issues
Highlights:
Easy way to add maps to Flask views.
Skills needed: Flask, Google APIs
Issues: https://github.com/rochacbruno/Flask-GoogleMaps
Easy way to protect your Flask views with login.
Skills needed: Python, Flask, environment variables, templating, CSS, HTML, Bootstrap.
Issues: https://github.com/rochacbruno/flask_simplelogin/issues
Highlights:
Quokka is a Content Management Framework which is in the process of being rewritten; the idea is to have a core written in Flask, use Pelican themes, and allow generation of static websites.
Skills needed: Flask, Python, Templating, MongoDB
Issues: https://github.com/rochacbruno/quokka
All the issues are good to pick up, as the project is being rewritten from scratch.
This is a guide for Pythonistas who are learning the Rust language.
Issues: https://github.com/rochacbruno/py2rs/issues
The project really needs more examples to be written, as well as more comparisons and fixes. Any kind of contribution is welcome.
This is a call 4 papers system (used as a didactic example only).
Skills needed: Flask, Templating, mongoDB
It’ll soon be PyWeek! Why not use Mu for a week of Pythonic game making fun? Your prize will be the fun of taking part and the respect of your peers.
I wrote my first game as an entry for the last PyWeek, and came a respectable 8th out of 23. I was able to use Mu’s PyGameZero mode to quickly create something goofy yet fun. My starting point was the easy to follow PyGameZero tutorial. I like to think it amateur, rather than amateurish:
It was a huge amount of fun, the community involved is friendly and supportive, and I’d like to encourage as many people as possible to enter the upcoming competition.
What’s involved?
The PyWeek challenge:
Entries must be developed in Python during the challenge, and must incorporate
some theme decided at the start of the challenge.
In this post, we will install Python 3.7.0 on Ubuntu 18.04. This is the latest version of the Python Programming Language.
Python 3.7 gives us a number of new features. For more, check out the What’s New in Python 3.7 page.
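As one quick illustration (my own example, not from the original post), the new dataclasses module cuts out a lot of boilerplate when defining simple classes:
from dataclasses import dataclass

@dataclass
class Point:
    x: float
    y: float

p = Point(1.0, 2.0)
print(p)  # Point(x=1.0, y=2.0): __init__ and __repr__ were generated for us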
Perform all the updates.
apt update && apt upgrade -y
Install everything we will need to build the source for Python.
apt install build-essential -y
apt install libncurses5-dev libgdbm-dev libnss3-dev libssl-dev libreadline-dev libffi-dev -y
First, download the source tarball.
wget https://www.python.org/ftp/python/3.7.0/Python-3.7.0.tgz
Unzip the tarball.
tar -xzvf Python-3.7.0.tgz
Next, change into the extracted source directory and configure the build.
cd Python-3.7.0
./configure --enable-optimizations
This will enable a release build of the code.
The downside is that it will take a while to build, because it runs tests that allow the binary to be optimized to run faster.
If you don’t care about that, you can run configure without the --enable-optimizations flag.
Next we need to build the binaries.
make
Once the build is done we can install the binaries.
make install
Python 3.7 is now installed.
# python3
Python 3.7.0 (default, Oct  1 2018, 13:10:35)
[GCC 7.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>>
Since Python 3.3 venv has been included with Python.
To create a virtual environment use the -m venv command line option.
$ python3 -m venv ../venv/py37test
Now you can activate the environment with:
$ source ../venv/py37test/bin/activate
You now have a Python 3.7 virtual environment.
(py37test) bill@dev:~/Development/python/py37test$ python --version
Python 3.7.0
The virtual environment is complete with Pip version 10.0.1.
$ pip --version
pip 10.0.1 from /home/bill/Development/python/venv/py37test/lib/python3.7/site-packages/pip (python 3.7)
(py37test) bill@dev:~/Development/python/py37test$
Check out these awesome Python courses on Pluralsight. Pluralsight has great video training courses at a great price. I use Pluralsight every time I need to learn something new.
In this post we learned how to install Python 3.7.0 on Ubuntu 18.04.
If you liked this post then please share and comment.
The post Install Python 3.7.0 on Ubuntu 18.04 appeared first on AdminTome Blog.
This is my monthly Debian LTS report.
Uploaded DLA-1519-1 and DLA-1520-1 to fix
CVE-2018-1000802, CVE-2017-1000158, CVE-2018-1061 and
CVE-2018-1060 in Python 2.7 and 3.4. The latter three were
originally marked as no-dsa
but the fix was trivial to backport. I
also found that CVE-2017-1000158 was actually relevant for 3.4 even
though it was not marked as such in the tracker.
CVE-2018-1000030 was skipped because the fix was too intrusive and unclear.
Security support for Thunderbird and Firefox versions from jessie has stopped upstream. Considering that the Debian security team bit the bullet and updated those in stretch, the consensus seems to be that the versions in jessie will also be updated, which will break third-party extensions in jessie.
One of the main victims of the XULocalypse is Enigmail, which completely stopped working after the stretch update. I looked at how we could handle this. I first proposed to wait before trying to patch the Enigmail version in jessie since it would break when the Thunderbird updates will land. I then detailed five options for the Enigmail security update:
update GnuPG 2 in jessie-security to work with Enigmail, which could break unrelated things
same as 1, but in jessie-backports-sloppy
package the JavaScript dependencies to ship Enigmail with OpenPGP.js correctly.
remove Enigmail from jessie
backport only some patches to GPG 2 in jessie
I then looked at helping the Enigmail maintainers by reviewing the OpenPGP.js packaging through which I found a bug in the JavaScript packaging toolchain, which diverged into a patch in npm2deb to fix source package detection and an Emacs function to write to multiple files. (!!) That work was not directly useful to Jessie, I must admit, but it did end up clarifying which dependencies were missing for OpenPGP to land, which were clearly out of reach of a LTS update.
Switching gears, I tried to help the maintainer untangle the JavaScript mess between multiple copies of code in TB, FF (with itself), and Enigmail's process handling routines; to call GPG properly with multiple file descriptors for password, clear-text, statusfd, and output; to have Autocrypt be able to handle "Autocrypt Setup Messages" (ASM) properly (bug #908510); and to finally make the test suite pass. The alternative here would be to simply rip Autocrypt out of Enigmail for the jessie update, but this would mean diverging significantly from the upstream version.
Reports of Enigmail working with older versions of GPG are deceiving, as that configuration introduces unrelated security issues (T4017 and T4018 in upstream's bugtracker).
So much more work remains on backporting Enigmail, but I might wait for the stable/unstable updates to complete before pushing that work further. Instead, I might focus on the Thunderbird and Firefox updates next.
I worked more on the GnuTLS research as a short followup to our previous discussion.
I wrote to the researchers, who "still stand behind what is written in the paper" and believe the current fix in GnuTLS is incomplete. GnuTLS upstream seems to agree, more or less, but points out that the fix, even if incomplete, greatly reduces the scope of those vulnerabilities, and that a long-term fix is underway.
Next step, therefore, is deciding if we backport the patches or just upgrade to the latest 3.3.x series, as the ABI/API changes are minor (only additions).
completed the work on gdm3 and git-annex by uploading DLA-1494-1 and DLA-1495-1
fixed Debian bug #908062 in devscripts to make dch generate proper version numbers since jessie was released
checked with the Spamassassin maintainer regarding the LTS update and whether we should just use 3.4.2 across all suites
reviewed and tested Hugo's work on 389-ds. That involved getting familiar with that "other" slapd server (apart from OpenLDAP), which I did not know about.
checked that kdepim doesn't load external content so it is not vulnerable to EFAIL by default. The proposed upstream patch changes the API so that work is postponed.
triaged the Xen security issues by severity
filed bugs about Docker security issues (CVE-2017-14992 and CVE-2018-10892)
Unfortunately, I have once again been spread quite thin across many unrelated projects this month.
I've played around with the latest attempt from the free software community to come up with a "federation" model to replace Twitter and other social networks, Mastodon. I've had an account for a while but I haven't talked about it much here yet.
My Mastodon account is linked with my Twitter account through some unofficial Twitter cross-posting app which more or less works. Another "app" I use is the toot client to connect my website with Mastodon through feed2exec.
And because all of this social networking stuff is just IRC 2.0, I read it all through my IRC client, thanks to Bitlbee and Mastodon is (thankfully) no exception. Unfortunately, there's a problem in my hosting provider's configuration which has made it impossible to read Mastodon status from Bitlbee for a while. I've created a test profile on the main Mastodon instance to double-check, and indeed, Bitlbee works fine there.
Before I figured that out, I tried upgrading the Bitlbee Mastodon bridge (for which I also filed an RFP) and found that a regression had been introduced somewhere after 1.3.1. On the plus side, the feature request I filed to allow custom visibility statuses from Bitlbee has been accepted, which means it's now possible to send "private" messages from Bitlbee.
Those messages, unfortunately, are not really private: they are visible to all followers, which, in the social networking world, means a lot of people. In my case, I had already accepted over a dozen followers before realizing how that worked, and I do not really know or trust most of those people. I still have 15 pending follow requests which I don't want to approve until there's a better solution, which would probably involve two levels of followship. There's at least one proposal to fix this already.
Another thing I'm concerned about with Mastodon is account migration: what happens if I'm unhappy with my current host? Or if I prefer to host it myself? My online identity is strongly tied to that hostname, and there don't seem to be good mechanisms for moving between Mastodon instances. OpenID had the concept of delegation, where the real OpenID provider could be discovered and redirected to, keeping a consistent identity. Mastodon's proposed solutions seem to aim at using redirections, or at least informing users that your account has moved, which isn't as nice, but might be an acceptable long-term compromise.
Finally, it seems that Mastodon will likely end up in the same space as email with regards to abuse: we are already seeing block lists show up to deal with abusive servers, which is horribly reminiscent of the early days of spam fighting, when you could still keep such lists by hand (as opposed to Bayesian filtering or machine learning). Fundamentally, I'm worried about the viability of this ecosystem, just like I'm concerned about the amount of fake news, spam, and harassment that takes place on commercial platforms. One theory is that the only way to fix this is to enforce two-way sharing between followers, the approach taken by Manyverse and Scuttlebutt.
Only time will tell, I guess, but Mastodon does look like a promising platform, at least in terms of raw numbers of users...
I've started switching towards ptpb.pw as a pastebin. Besides the unfortunately cryptic name, it's a great tool: identical pastes are deduplicated, large pastes are allowed, there is a (limited) server-side viewing mechanism (allowing for some multimedia), and so on. The only things missing are "burn after reading" (one-shot links) and client-side encryption, though the latter is planned.
I like the simplistic approach to the API that makes it easy to use from any client. I've submitted the above feature request and a trivial patch so far.
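To illustrate how simple that API is to use from any client, uploading a paste from Python is roughly the following. This is a hedged sketch based on ptpb.pw's documented form-field upload (content in a field named c), not code from the post itself.

import requests

# Upload a paste to ptpb.pw; the service expects the content in a
# multipart form field named "c" (the same field the curl -F c=... examples use).
resp = requests.post("https://ptpb.pw/", files={"c": ("example.txt", "hello from python")})
resp.raise_for_status()
print(resp.text)  # the response contains the URL of the new paste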
I've done a few reviews and some sponsoring of Emacs Lisp Package Archive ("ELPA") packages for Debian, mostly for packages I requested myself but which were so nicely made by Nicolas (elpa-markdown-toc, elpa-auto-dictionary). To better figure out which packages are missing, I wrote this script to parse the output of an ELPA and compare it with what is in Debian. This involved digging deep into the API of the Debian archive, which in turn was useful for the JavaScript work previously mentioned. The result is in the firefox page, which lists all the extensions I use and their equivalents in Debian.
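The idea behind such a script is roughly the following. This is a rough sketch rather than the actual script: it assumes MELPA's archive.json endpoint and checks the locally configured apt cache for a matching elpa-* package, whereas the real script talks to the Debian archive API.

import json
import subprocess
import urllib.request

# Fetch the package list from MELPA; other ELPAs publish an elisp
# archive-contents file instead, which would need its own parser.
with urllib.request.urlopen("https://melpa.org/archive.json") as resp:
    elpa_packages = json.load(resp)

missing = []
for name in sorted(elpa_packages):
    # Debian's convention is to prefix packaged Emacs Lisp add-ons with "elpa-".
    deb_name = "elpa-" + name
    result = subprocess.run(["apt-cache", "policy", deb_name],
                            capture_output=True, text=True)
    if deb_name not in result.stdout:
        missing.append(name)

print("%d MELPA packages have no obvious Debian counterpart" % len(missing))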
I'm not very happy with the script: it's dirty, and I feel dirty. It seems to me this should be done on the fly, through some web service, and should support multiple languages. It seems we are constantly solving this problem for each ecosystem while the issues are similar...
I went down another rabbit hole after learning about Mozilla's plan to force more or less mandatory telemetry in future versions of Firefox. That got me thinking of how many such sniffers were in Firefox and I was in for a bad surprise. It took about a day to establish a (probably incomplete) list of settings necessary to disable all those trackers in a temporary profile starter, originally designed as a replacement for chromium --temp-profile, but which turned out to be a study of Firefox's sins.
There are over a hundred about:config settings that need to be tweaked if someone wants to keep their privacy intact in Firefox. This is especially distressing because Mozilla prides itself on its privacy politics. I've documented this in the Debian wiki as well.
Ideally, there would be a one-shot toggle to disable all those things. Instead, Mozilla forces us to play whack-a-mole as yet another undocumented configuration item pops up with every other release.
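To give an idea of what such a temporary profile starter involves, here is a minimal sketch. It is not the author's actual script; the two prefs shown are real telemetry settings, but the full list of settings is much longer.

import os
import subprocess
import tempfile

# Two well-known telemetry prefs; the complete privacy list is far longer.
PREFS = {
    "toolkit.telemetry.enabled": "false",
    "datareporting.healthreport.uploadEnabled": "false",
}

profile = tempfile.mkdtemp(prefix="firefox-temp-profile-")
with open(os.path.join(profile, "user.js"), "w") as f:
    for key, value in PREFS.items():
        f.write('user_pref("%s", %s);\n' % (key, value))

# -no-remote keeps this instance separate from any already running Firefox.
subprocess.run(["firefox", "-profile", profile, "-no-remote"])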
migrated from once to pass-otp for one time password storage
provided some documentation fixes for Riseup after struggling to use their (various) VPN services
published an LWN article about archival which led to a bunch of issues and pull requests, but also a NEW (as in, not in yet) Debian package for the ia archive.org command-line client
finally forked off my Docker Airsonic image after upstream explained they don't use Subsonic anymore
published undertime 1.4.0, with only minor changes
contributed to, and sponsored, the new version of the xscreensaver package in Debian
sponsored the new tuptime release
tried to push the rapid-photo-downloader update, only to see the current maintainer orphan the package. Fortunately, someone else volunteered for maintenance and the package will not be orphaned for long.
discussed with the notmuch upstream regarding background OpenPGP key updates and attachment checks, now picked up by David Edmondson, thanks!
researched the history of the venerable fortune command and how it could be improved
pushed a few commits to monkeysign to try and fix the numerous RC bugs threatening its inclusion in Debian Buster, still incomplete (Debian bug #899060, Debian bug #902367, Debian bug #841208)
Any application that communicates with other systems or services will at some point require a credential or sensitive piece of information to operate properly. The question then becomes how best to securely store, transmit, and use that information. The world of software secrets management is vast and complicated, so in this episode Brian Kelly, engineering manager at Conjur, aims to help you make sense of it. He explains the main factors for protecting sensitive information in your software development and deployment, ways that information might be leaked, and how to get the whole team on the same page.
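As a trivial illustration of the "don't hard-code credentials" advice, even reading a secret from the environment instead of the source code is a step up. This is a minimal sketch, not from the episode itself; the variable name DB_PASSWORD is just an example.

import os
import sys

# Read the credential from the environment rather than committing it to
# source control, and fail loudly if it is missing.
password = os.environ.get("DB_PASSWORD")
if password is None:
    sys.exit("DB_PASSWORD is not set; refusing to start")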
The intro and outro music is from Requiem for a Fish by The Freak Fandango Orchestra / CC BY-SA
Note
I wrote this as part of a discussion recently, and I think it makes sense to share it here. This is a lot of text though, so feel free to skip forward.
Indeed I have a private repo that I push to, and only my private CI picks it up. Based on Buildbot, it runs many more compilations, basically around the clock on all of my computers, to find regressions from new optimization or codegen changes, and from UI changes too.
Public CI offerings like Travis are not aimed at allowing this many compilations. It will be a while before public cloud infrastructure is donated to Nuitka, although I can see it happening some time in the future. This leaves developers with the burden of running tests on their own hardware, and there is never enough of it. Casual contributors will never be able to do this themselves.
My scope is running the CPython test suites on Windows and Linux. These are the adapted 26, 27, 32, 33, 34, 35, 36, 37 suites, and to get even more errors covered, they are also run with mismatching Python versions, so a lot of exceptions are raised. Often running the 36 tests with 37 and vice versa extends the coverage, because of the exceptions being raised.
On Windows I compile with and without debug mode, for x86 and x64, and it's starting to be a bit much. For Linux I have 2 laptops in use, and an ARM CuBox bought from your donations; there it's working better, especially due to ccache being used everywhere, although recent investigations show room for improvement there as well.
For memory usage I still compile Mercurial and observe the memory it uses, in addition to comparing the Mercurial tests to the expected outputs its test suite provides. It's a sad day when the Mercurial tests find changes in behavior, but luckily that has been rare. Running the Mercurial test suite gives some confidence that Nuitka is not silently corrupting the data it works with.
Caching the CPython outputs of tests to compare against is something I am going to make operational in the coming days, trying to make things ever faster. There is no point in re-running tests with CPython just to get at their output, which typically does not change at all.
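A minimal sketch of what such output caching could look like follows. It is hypothetical, not Nuitka's actual test runner; it keys the cache on a hash of the test file contents and the interpreter used.

import hashlib
import os
import subprocess

CACHE_DIR = os.path.expanduser("~/.cache/cpython-test-outputs")

def cached_cpython_output(test_file, python="python3"):
    # Key the cache on the interpreter and the test file contents, so a
    # changed test or interpreter invalidates the cached output.
    with open(test_file, "rb") as f:
        digest = hashlib.sha256(python.encode() + f.read()).hexdigest()
    cache_path = os.path.join(CACHE_DIR, digest)
    if os.path.exists(cache_path):
        with open(cache_path, "rb") as f:
            return f.read()
    output = subprocess.run([python, test_file], capture_output=True).stdout
    os.makedirs(CACHE_DIR, exist_ok=True)
    with open(cache_path, "wb") as f:
        f.write(output)
    return output

# The compiled program's output can then be compared against the cached
# CPython output without re-running CPython every time.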
For the time being, ccache.exe and clcache.exe seem to have done wonders on Windows too, but I will want to investigate some more to avoid unnecessary cache misses.
As for my workflow with Nuitka, I often tend to let some commits settle in my private repo only, until they become trusted. Other times I will make bigger changes and put them out on factory immediately, because it would be hard to split up the changes later, so putting them out right away makes things easier.
I am more conservative with factory right after telling people to try something there. But I also break it on purpose, just to try something out. I really consider it a private branch for interacting with me or with public CI. I do not recommend using it; it's like a permanent pull request of mine that is never going to be finished.
Then, on occasion, I sort all the commits on factory and split them into things that become hotfixes, things that become the current pre-release, and things that remain in that proving ground. That is why I typically make a hotfix and a pre-release at the same time. The git flow model suggests doing that and it's easy, so why not. As a bonus, develop is then practically stable at nearly all times too, with hardly any regressions.
I do, however, normally not take things as hotfixes that are already on develop; I hate the duplication of commits. Hotfixes must be small, risk-free, and easy to put out; when there is any risk, it definitely stays on develop. Nuitka stable typically covers nearly all ground already, so there is no need to panic about adding missing things and breaking others.
For me, git bisect is very important. My private commit history is basically a total mess and worthless, but on factory I make very nicely organized commits that I will frequently amend, even for the random PyLint cleanup. This allows me, when e.g. one test suddenly says "segfault" on Windows, to easily find the change that triggers it, look at the C code difference, spot the bug introduced, then amend the commit and be done with it.
It's amazing how much time this can save. My goal is to always have a workable state that is supposed to pass all tests. Obviously I cannot prove that for every commit, but when I know it is not the case, I tend to rebase. At times I have been tempted to, and have followed up on, retroactively amending develop and even stable.
I do that to be sure to keep the bisect ability, but fortunately that kind of bug is rare, and I try not to do it.
As with recent changes, I sometimes make changes guarded by the isExperimental() marker, activating breaking changes only gradually. The C bool type code generation had been there for months in a barely useful form until it became more polished, always guarded by a switch, until one day, for 0.6, I finally flipped it, having made the necessary fixes retroactively before that switch, so that it worked while it was still on factory.
Then I will remove the experimental code. I feel it is very important, and even ideal, to always be able to compare outputs against a fully working solution. I am willing to postpone some cleanups to a later date as the price, but when something in my mind later tells me "this cannot possibly have ever worked", the answer to compare against is just a command line flag away. Plus, it includes the extra changes that happened in the meantime, so they don't add noise to diffs of the generated C code, for example.
Then looking at that diff, I can tell where the unwanted effect is, and fix all the things, and that way find bugs much faster.
Even better, if I decide to do a cleanup as part of making a change more viable to execute, then I get to execute it on stable ground, covered by the full test suite. I can complete that cleanup; e.g. using variable identifier objects instead of mere strings was needed to make "heap generators" more workable. But I was able to activate that one before "heap generators" was ever fully workable, complete it, and actually reap some of its benefits already.
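The pattern is roughly the following. This is a generic sketch of such a feature switch, not Nuitka's actual implementation; the flag name fast_sum and the toy computation are made up purely for illustration.

import os

def isExperimental(flag):
    # Gate in-progress code behind an explicit opt-in (here an environment
    # variable), so the old and new code paths stay comparable at any time.
    return flag in os.environ.get("EXPERIMENTAL", "").split(",")

def compute(values):
    if isExperimental("fast_sum"):
        # new, still-maturing code path
        return sum(values)
    # known-good fallback whose output the new path must match
    total = 0
    for v in values:
        total += v
    return total

print(compute([1, 2, 3]))  # both paths must agree; any diff points at the new code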
Obviously this takes a lot of hardware and CPU to be able to compile this much Python code on a regular basis. And I really wish I could add one of the new AMD Threadripper 2 to the mix. Anybody donating one to me? Yes I know, I am only dreaming. But it would really help the cause.
So 0.6 is out, and there is already a hotfix, which mostly addresses use cases of people for whom things didn't work. More people seem to have tried out 0.6.0, and as a result 0.6.0.1 is going to cover a few corner cases. So far I have not encountered a single regression introduced by 0.6.0; the issues found were instead already present in 0.5.33, which did have one regression that was not easy to fix.
So that went really smoothly.
The UI needs more work still. Specifically, the fact that packages do not automatically include everything below them, and have to be specified by file path instead of by name, is really annoying to me.
But I had already delayed 0.6 for some UI work, so some of the quirks will remain for now. I will work on these things eventually.
So I updated the website to state that PyStone is now 312% faster, replacing a number that was very old. I have since run it with a version updated for Python3, and the speedup is much smaller there. That is pretty sad.
I will be looking into that for the 0.6.1 release, or I will have to update the wording to provide two numbers there, because as it stands, the number might be misleading regarding Python3 performance with Nuitka.
Something with unicode strings and in-place operations is driving me crazy. Nuitka is apparently slower for that, and I can't pinpoint where that is happening exactly. It seems unicode objects may internally be put into a different state by some operations, which then makes extending them in place via realloc fail more often, but I do not know yet.
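The kind of micro-benchmark in question looks roughly like this. It is a hedged sketch (the loop size and repetition count are arbitrary), timing in-place unicode concatenation, which CPython can often perform by extending the string in place when it is not referenced elsewhere.

import timeit

def inplace_concat(n=10000):
    s = ""
    for i in range(n):
        s += "x"  # CPython may extend s in place when its refcount allows it
    return s

# Compare this number between CPython and a compiled binary.
print(timeit.timeit(inplace_concat, number=200))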
So more work has been put into those in-place operations, adding more specialization, and especially also applying it to module variables. CPython can do that, and actually gives itself a hard time about it, and Nuitka should be able to do this much more cleverly with its more static knowledge.
But I cannot tell you how much head scratching was wasted debugging that. I was totally stupid about how I approached it; looking back from the final solution, it was always easy. Just not for me, apparently.
I talked about those above. A top level logging module of your own was working fine in accelerated mode, but for standalone it failed and the one from the standard library was used instead. That kind of shadowing happened because Nuitka was going from module objects to their names and back to objects, which is bad in case of duplicates. That is fixed on develop, and it is one of those risky cases where it cannot be a hotfix because it touches too much.
Then, pure Python3 packages need not have an __init__.py. So far that worked best for sub-packages, but after the 0.6.0.1 hotfix it now also works when the main package you compile is missing that file.
So instructions have been provided on how to properly make that work for Python standalone on Windows. I have yet to live up to my promise to make Nuitka automatically include the necessary files. I hope to do that for 0.6.1 though.
So I am looking at ccache on Linux right now, and found e.g. that it was reporting that gcc --version was called a lot at startup of Scons, and then g++ --version once. The latter is particularly wasteful, because we are not going to use g++ normally, except if gcc is really old and does not support C11. So in case a good gcc was found, let's disable that version query and not do it at all.
And for the gcc version output, monkey patching Scons with a variant that caches the result removes those unnecessary forks.
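The caching itself can be as simple as memoizing the subprocess call. This is a minimal sketch of the idea, not the actual Scons patch.

import functools
import subprocess

@functools.lru_cache(maxsize=None)
def compiler_version(compiler):
    # Fork the compiler once per process and remember the answer, instead of
    # re-running "gcc --version" for every build step that asks for it.
    return subprocess.run([compiler, "--version"],
                          capture_output=True, text=True).stdout.splitlines()[0]

print(compiler_version("gcc"))
print(compiler_version("gcc"))  # second call is served from the cache, no fork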
So ccache is being called less frequently, and those --version calls actually appear to take measurable time. It's not dramatic, but ccache was apparently taking locks, and that's worth avoiding by itself.
That said, the goal is to have both ccache and clcache report their cache effectiveness at the end of a test suite run. That way I am hoping to notice, and to know, whether caching is being used to its full effect.
I continue to be very active there. I put out a poll about the comment system, and as a result I am disabling Disqus comments; I will now focus on Twitter for web site comments too.
And let's not forget, having followers makes me happy. So do re-tweets.
Adding Twitter more prominently to the web site is something that is also going to happen.
If you are interested, I am tagging issues as help wanted, and there is a bunch of them, very likely including at least one you can help with.
Nuitka definitely needs more people to work on it.
Working on the 0.6.1 release, I am attacking more in-place add operations as a first goal, and now turning to binary operations, trying to shape what using different helper functions for different object types looks like, and to gain performance without C types. But ultimately the same issue will arise there: what to do with mixed input types.
My desire is for in-place operations to fully catch up with CPython, as these can easily lose a lot of performance. Closure variables and their cells are another target to pick on, and I feel they ought to be next now that module variables are working, because their solution ought to be very similar. Showing that this is faster in all cases, no matter whether the target storage is local, closure, or module, would then be a goal for the 0.6.1 release.
This does not feel too far away, but we will see. I am considering next weekend for the release.
If you want to help but cannot spend the time, please consider donating to Nuitka, and go here: