Background

Sometimes an application needs to work at the byte level. I feel there are not enough examples of how to do this in Python, and it is easy to over-complicate the solution.

This Example

I plan to cover several aspects of working with bytes in this example: the struct module, the bytearray built-in, and the ctypes module.

A statement on copying

Be careful when slicing a Python bytearray, bytes, or array.array. Slicing creates a copy, which can hurt the performance of your application. There is a better way: memoryview. A memoryview works with anything that implements the Python buffer protocol and makes slicing very efficient, because slicing a memoryview yields another memoryview rather than a copy of the underlying bytes.
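A small sketch of the difference, using a made-up ten-byte buffer (the variable names are only for illustration). The slice of the bytearray is an independent copy, while the slice of the memoryview still points at the original bytes:

```python
# A hypothetical buffer used only for this illustration.
data = bytearray(b'0123456789')

copied = data[2:6]             # slicing a bytearray allocates a new bytearray
view = memoryview(data)[2:6]   # slicing a memoryview yields another memoryview

# Change the underlying buffer after both slices were taken.
data[2] = ord('X')

print(bytes(copied))  # the copy does not see the change
print(bytes(view))    # the view does, because it shares memory with data
```

The copy keeps the old bytes, and the view reflects the write, which is exactly why no bytes had to be duplicated to create it.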

A Python Oddity

I was recently using Python to encode a CSV file with a custom dialect when I noticed something odd. The csv writer takes an optional lineterminator argument that lets you change the line terminator. That’s fine; however, the csv reader does not honor the argument. So, through the default API, you can create CSV that you cannot read back in with the default API. According to the documentation, this applies to Python 2.7 through 3.7. It probably also applies to versions earlier than 2.7, but that documentation is no longer online.

Python.org Documentation

A Simple Script

I wrote the following simple script to illustrate this oddity. It converts a list of lists to CSV and back using various line terminators; if the output differs from the source, the difference is displayed.

Python CSV Oddity

import csv
import io


def convert_to_csv_and_back(rows: list, lineterminator: str):
    # Write the rows to an in-memory buffer using the requested terminator.
    with io.StringIO() as o:
        writer = csv.writer(o, lineterminator=lineterminator)
        for row in rows:
            writer.writerow(row)

        # Read the same text back, passing the same terminator to the reader.
        with io.StringIO(o.getvalue()) as i:
            reader = csv.reader(i, lineterminator=lineterminator)
            return list(reader)


source = [
    ['this', 'is', 'row', '1'],
    ['this', 'is', 'row', '2'],
    ['this', 'is', 'row', '3']
]

lineterminators = ['|', ':', '\t', '\r\n']

for terminator in lineterminators:
    output = convert_to_csv_and_back(source, terminator)

    # Compare the rows as sets of tuples; any leftover row is a mismatch.
    source_set = set(map(tuple, source))
    output_set = set(map(tuple, output))
    difference = source_set.symmetric_difference(output_set)
    to_from_csv_failed = len(difference)

    print(f'Line Terminator: {repr(terminator)}')
    print(f'To/From CSV Worked: {"No" if to_from_csv_failed else "Yes"}')
    print(f'Source: {source}')
    print(f'Output: {output}')

    if to_from_csv_failed:
        print(f'Difference: {difference}')

    print('--------------------------')

Results

As the results show, every line terminator except ‘\r\n’ failed to round-trip.

Closing Thoughts

So, why would you want to specify the line terminator with the csv module? There are probably only a handful of reasons. My only reason would be a custom dialect in which I wanted to ensure that the output contained no carriage returns or line feeds. 99.9% of the time, you want ‘\r\n’.
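If you do need a custom terminator, one possible workaround is to split the text yourself, since csv.reader accepts any iterable of strings. This is only a sketch: the helper name read_csv_with_terminator is my own, and it assumes the terminator never appears inside a field and that there are no intentionally empty rows.

```python
import csv
import io


def read_csv_with_terminator(text: str, lineterminator: str) -> list:
    """Hypothetical helper: split records manually, then let csv parse each one.

    Assumes the terminator never occurs inside a field.
    """
    # Drop empty pieces, such as the one left after the final terminator.
    records = [record for record in text.split(lineterminator) if record]
    return list(csv.reader(records))


# Write two rows with '|' as the line terminator.
buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator='|')
writer.writerows([['this', 'is', 'row', '1'], ['this', 'is', 'row', '2']])

rows = read_csv_with_terminator(buffer.getvalue(), '|')
print(rows)
```

This keeps the csv module in charge of quoting and delimiters and only takes over the record splitting that the reader will not do for you.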

The August 21, 2017 Solar Eclipse as photographed from Sparta, TN (35.9727° N, 85.5638° W). This was about 3.2 miles from the center of Totality! These images do not convey the magnitude of the experience. Knowing what to expect in general does not prepare you. If I tried to find a single word to describe it, I would pick astonishing.

A while back I had some thoughts on communication. If you’ve ever played World of Tanks Blitz, you’ll know that it’s basically one team of tanks against another. Given the pick-up, fast-paced nature of the game, communication is minimal at best (sometimes limited to a single “<<<<<<<<<” or “>>>>>>>>>” indicating which direction to take the offense). I found that teams that could coordinate with minimal communication, play their tank roles (scouts, mediums, heavies, and destroyers), and move fast could achieve massive, overwhelming victories. Something similar is probably true in an agile/teamwork environment. Know your stuff, know your role, take opportunities, work together, succeed.

I finally got a chance to research exactly what was wrong. Given that Python 3.5 was working on my machine and 3.6 was working on several Linux machines, I thought that it could have been an issue with Python 3.6, unlikely as that may be. When no bug-fix release appeared, I realized that it must be my configuration.

Searching, I found one result related to this in the Python issue tracker: Issue 28150. One answerer points out that Python on macOS no longer relies on Apple’s version of OpenSSL; instead, it ships with its own copy. The gotcha: this copy does not have trust certificates installed. All of this is detailed in the installer Readme.

The Installer Readme

So, as the Python issue tracker mentioned, there is an entry in the installer readme. Perhaps I should read these more closely…

Certificate verification and OpenSSL

**NEW** This variant of Python 3.6 now includes its own private copy of OpenSSL 1.0.2. Unlike previous releases, the deprecated Apple-supplied OpenSSL libraries are no longer used. This also means that the trust certificates in system and user keychains managed by the Keychain Access application and the security command line utility are no longer used as defaults by the Python ssl module. For 3.6.0, a sample command script is included in /Applications/Python 3.6 to install a curated bundle of default root certificates from the third-party certifi package (https://pypi.python.org/pypi/certifi). If you choose to use certifi, you should consider subscribing to the project’s email update service to be notified when the certificate bundle is updated.

The bundled pip included with the Python 3.6 installer has its own default certificate store for verifying download connections.
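A quick way to check where a given interpreter looks for trust certificates is to ask the ssl module for its default verify paths. This is just a diagnostic sketch of my own, not something from the readme; the values it prints depend entirely on how your Python and OpenSSL were built.

```python
import ssl

# Report the certificate locations this interpreter's OpenSSL uses by default.
paths = ssl.get_default_verify_paths()

print('cafile:        ', paths.cafile)          # resolved CA file, or None if it does not exist
print('capath:        ', paths.capath)          # resolved CA directory, or None if it does not exist
print('openssl_cafile:', paths.openssl_cafile)  # compiled-in default CA file path
print('openssl_capath:', paths.openssl_capath)  # compiled-in default CA directory path
```

If cafile and capath both come back as None, the interpreter has no default certificates to verify against, which matches the symptom described above.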

Pip installs from GitHub

Installing a package from GitHub is fairly simple. The following are examples of installing packages from GitHub.

Pip installs from github

# Install the package from the repository URL (default branch, master here)
pip3 install git+https://github.com/username/repo.git

# Install the package from the specified branch
pip3 install git+https://github.com/username/repo.git@branch

# Install the package from the specified commit
pip3 install git+https://github.com/username/repo.git@commit

requirements.txt

At some point, you will want to list a dependency that resides on GitHub in your requirements.txt file. This is fairly straightforward if you only plan to use it in requirements.txt (not processed for use in setup.py). Note that I’ve specified the master branch of repo and given it a version id.

Sample requirements.txt

https://github.com/username/repo/tarball/master#egg=repo-0.1.1

setup.py and requirements.txt

Some people process their requirements.txt file to generate the install_requires parameter for the setup function called in setup.py. This works fine until you have a dependency hosted on GitHub. Setup will fail to find your dependencies if your requirements.txt has a line like the one above. To remedy this, we must do two things.

1. Parse the line to create a named Python dependency for install_requires. Given the file above, install_requires would equal ["repo==0.1.1"].

2. Pass the dependency_links argument to setup: ["https://github.com/username/repo/tarball/master#egg=repo-0.1.1"] for this example.
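The two steps above can be sketched for a single line as follows. The helper name split_requirement is hypothetical, and the sketch assumes every URL line ends in an #egg=name-version fragment like the sample requirements.txt; anything else is passed through untouched.

```python
def split_requirement(line: str):
    """Hypothetical helper: return (install_requires entry, dependency link or None)."""
    if line.startswith(('http://', 'https://')) and '#egg=' in line:
        egg = line.split('#egg=', 1)[1]          # e.g. 'repo-0.1.1'
        name, _, version = egg.rpartition('-')   # split on the last hyphen
        if name:
            return f'{name}=={version}', line
        return egg, line                         # no version given in the fragment
    return line, None                            # an ordinary requirement


requirement, link = split_requirement(
    'https://github.com/username/repo/tarball/master#egg=repo-0.1.1'
)
print(requirement)
print(link)
```

The first value goes into install_requires and the second, when present, into dependency_links.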

An Example

This is an example of a setup.py that properly processes requirements.txt dependencies that are located on GitHub. It’s probably not complete, but it works for what I need. Feel free to take it and adapt it. See the repo here (python3).

from setuptools import setup

version = '0.1.0'

temp_install_reqs = []
install_reqs = []
dependency_links = []

with open("requirements.txt", "r") as f:
    temp_install_reqs = list(map(str.strip, f.readlines()))

for req in temp_install_reqs:
    # This if should be expanded with all the other possibilities that can exist. However, this