Channel: Planet Python

Catalin George Festila: The pattern python module - part 001.

This is a very short introduction to the pattern Python module.
The module is rich in options and features.
I will try to show some parts that are useful for most Python users.
About the pattern Python module:
Pattern is a web mining module for the Python programming language.
It has tools for data mining (Google, Twitter and Wikipedia API, a web crawler, an HTML DOM parser), natural language processing (part-of-speech taggers, n-gram search, sentiment analysis, WordNet), machine learning (vector space model, clustering, SVM), network analysis and visualization.
Pattern developer documentation
Module          Functionality
pattern.web     Asynchronous requests, web services, web crawler, HTML DOM parser.
pattern.db      Wrappers for databases (MySQL, SQLite) and CSV-files.
pattern.text    Base classes for parsers, parse trees and sentiment analysis.
pattern.search  Pattern matching algorithm for parsed text (syntax & semantics).
pattern.vector  Vector space model, clustering, classification.
pattern.graph   Graph analysis & visualization.

I used it on Fedora Linux; here is the installation of this Python module:
[root@localhost ~]# pip install pattern
Collecting pattern
Downloading pattern-2.6.zip (24.6MB)
100% |████████████████████████████████| 24.6MB 61kB/s
Installing collected packages: pattern
Running setup.py install for pattern ... done
Successfully installed pattern-2.6

Frequently used single character variable names:
Variable  Meaning                 Example
a         array, all              a = [normalize(w) for w in words]
b         boolean                 while b is False:
d         distance, document      d = distance(v1, v2)
e         element                 e = html.find('#nav')
f         file, filter, function  f = open('data.csv', 'r')
i         index                   for i in range(len(matrix)):
j         index                   for j in range(len(matrix[i])):
k         key                     for k in vector.keys():
n         list length             n = len(a)
p         parser, pattern         p = pattern.search.compile('NN')
q         query                   for r in twitter.search(q):
r         result, row             for r in csv('data.csv'):
s         string                  s = s.decode('utf-8').strip()
t         time                    t = time.time() - t0
v         value, vector           for k, v in vector.items():
w         word                    for i, w in enumerate(sentence.words):
x         horizontal position     node.x = 0
y         vertical position       node.y = 0
Pattern contains part-of-speech taggers for a number of languages (including English, Spanish, German, French and Dutch). Part-of-speech tagging is useful in many data mining tasks. A part-of-speech tagger takes a string of text and identifies the sentences and the words in the text along with their word type. 
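
To make the idea concrete, here is a minimal toy sketch of what a part-of-speech tagger returns. It is not how pattern implements tagging: the tiny LEXICON, the helper names (split_sentences, tokenize, tag) and the fallback to NN for unknown words are all assumptions made for illustration, and the punctuation tags are simplified. In pattern itself this job is done by functions such as parse() in pattern.en.

```python
import re

# Hypothetical mini-lexicon: lowercase word -> Penn Treebank style tag.
# A real tagger uses a large lexicon plus contextual rules or a trained model.
LEXICON = {
    "the": "DT", "a": "DT",
    "cat": "NN", "mat": "NN", "dog": "NN",
    "sat": "VBD", "barked": "VBD",
    "on": "IN",
    "black": "JJ",
}


def split_sentences(text):
    """Split text after '.', '!' or '?' when followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def tokenize(sentence):
    """Split a sentence into word tokens and single punctuation tokens."""
    return re.findall(r"\w+|[^\w\s]", sentence)


def tag(text, default="NN"):
    """Return one list of (word, tag) pairs per sentence."""
    tagged = []
    for sentence in split_sentences(text):
        pairs = []
        for token in tokenize(sentence):
            if re.match(r"\w", token):
                # Known words get their lexicon tag, unknown words the default.
                pairs.append((token, LEXICON.get(token.lower(), default)))
            else:
                # Simplification: sentence-final marks get '.', others tag as themselves.
                pairs.append((token, "." if token in ".!?" else token))
        tagged.append(pairs)
    return tagged


result = tag("The black cat sat on the mat. A dog barked.")
for sentence in result:
    print(" ".join("%s/%s" % pair for pair in sentence))
```

Each sentence comes back as a list of (word, tag) pairs, which is the general shape of output that later steps such as chunking or pattern matching build on.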


Language  Code  Speakers  Example countries
Spanish   es    350M      Argentina (40), Colombia (40), Mexico (100), Spain (45)
English   en    340M      Canada (30), United Kingdom (60), United States (300)
German    de    100M      Austria (10), Germany (80), Switzerland (7)
French    fr    70M       France (65), Côte d'Ivoire (20)
Italian   it    60M       Italy (60)
Dutch     nl    27M       The Netherlands (25), Belgium (6), Suriname (1)
import pattern.en  # English
import pattern.es  # Spanish
import pattern.nl  # Dutch
import pattern.de  # German
The pattern.web module can work with many websites and web services, for example:
from pattern.web import Wikipedia
from pattern.web import Yahoo
from pattern.web import Twitter
from pattern.web import Facebook
from pattern.web import Flickr
from pattern.web import GMAIL
from pattern.web import GOOGLE
Now, about pattern.db.
The pattern.db module contains wrappers for databases (SQLite, MySQL), Unicode CSV files and Python's datetime. It offers a convenient way to work with tabular data, for example retrieved with the pattern.web module.
from pattern.db import Database, field, pk, STRING, BOOLEAN, DATE, NOW

# Connect to the 'people' database (SQLite by default)
db = Database('people')
# Create a table with a primary key and four fields
db.create('area_people', fields=(
    pk(),
    field('name', STRING(80), index=True),
    field('type', STRING(20)),
    field('date_birth', DATE, default=None),
    field('date_created', DATE, default=NOW)
))
# append() returns the id of the new row (here: 1)
db.area_people.append(name=u'George', type='male')
print(db.area_people.rows()[0])
# Output:
# (1, u'George', u'male', None, Date('2017-03-06 22:38:13'))
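
For comparison, here is a rough sketch of similar steps written directly against Python's standard sqlite3 module. This is only an assumption about the kind of SQL a wrapper like pattern.db saves you from writing, not pattern's actual implementation; the in-memory database, the column types, the index name and the helper append_person are all made up for this example.

```python
import sqlite3
from datetime import datetime

# In-memory database so the sketch leaves no file behind.
conn = sqlite3.connect(":memory:")

# Table layout mirroring the fields above (column types are an assumption).
conn.execute(
    """
    CREATE TABLE area_people (
        id           INTEGER PRIMARY KEY AUTOINCREMENT,
        name         VARCHAR(80),
        type         VARCHAR(20),
        date_birth   TEXT DEFAULT NULL,
        date_created TEXT
    )
    """
)
# Counterpart of index=True on the 'name' field.
conn.execute("CREATE INDEX area_people_name ON area_people (name)")


def append_person(name, type_, date_birth=None):
    """Insert one row, stamp it with the current time, return its id."""
    now = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
    cursor = conn.execute(
        "INSERT INTO area_people (name, type, date_birth, date_created) "
        "VALUES (?, ?, ?, ?)",
        (name, type_, date_birth, now),
    )
    conn.commit()
    return cursor.lastrowid


new_id = append_person("George", "male")
row = conn.execute("SELECT * FROM area_people").fetchall()[0]
print(new_id)
print(row)
```

The wrapper version is shorter because the table definition, the default timestamp and the insert are expressed as Python objects and keyword arguments instead of SQL strings.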

