Several improvements to the image-format-vs-repo-size experiment and the
report documenting it: dropped the first 3 columns of data to make the
bar chart clearer; drawing the bar chart on a transparent BG in case
it's used on a page with a non-white BG, as with a selected Fossil forum
post; added axis labels; added a run time calculation to the expensive
first step; fixed a few syntax problems that prevent the Python code
from compiling on Python 3; documented some problems with running it under
Anaconda on macOS; better documented the notebook's dependencies; many
clarifications to the experimental report text.
check-in: 41e5237a user: wyoung tags: trunk

"source": [
"# Image Format vs Fossil Repository Size\n",
"\n",
"## Prerequisites\n",
"\n",
"This notebook was developed with [JupyterLab][jl]. To follow in my footsteps, install that and the needed Python packages:\n",
"\n",
" $ sudo pip install jupyterlab matplotlib pandas wand\n",
"\n",
"In principle, it should also work with [Anaconda Navigator][an], but because [Wand][wp] is not currently in the Anaconda base package set, you may run into difficulties making it work, as we did on macOS. These problems might not occur on Windows or Linux.\n",
"\n",
"This notebook uses the Python 2 kernel because macOS does not include Python 3, and we don't want to make adding that a prerequisite for those re-running this experiment on their own macOS systems. The code was written with Python 3 syntax changes in mind, but we haven't yet successfully tried it in a Python 3 Jupyter kernel.\n",
"\n",
"[an]: https://www.anaconda.com/distribution/\n",
"[jl]: https://github.com/jupyterlab/\n",
................................................................................
"\n",
"# Merge per-format test data into a single DataFrame without the first\n",
"# 3 rows: the initial empty repo state (boring) and the repo DB\n",
"# size as it \"settles\" in its first few checkins.\n",
"data = pd.concat(repo_sizes, axis=1).drop(range(3))\n",
"\n",
"mpl.rcParams['figure.figsize'] = (6, 4)\n",
"#plt.rcParams['axes.facecolor'] = 'white'\n",
"ax = data.plot(kind = 'bar', colormap = 'coolwarm',\n",
" grid = False, width = 0.8,\n",
" edgecolor = 'white', linewidth = 2)\n",
"ax.axes.set_xlabel('Checkin index')\n",
"ax.axes.set_ylabel('Repo size (MiB)')\n",
"plt.savefig('image-format-vs-repo-size.svg', transparent=True)\n",
"plt.show()"

"source": [
"# Image Format vs Fossil Repository Size\n",
"\n",
"## Prerequisites\n",
"\n",
"This notebook was developed with [JupyterLab][jl]. To follow in my footsteps, install that and the needed Python packages:\n",
"\n",
" $ pip install jupyterlab matplotlib pandas wand\n",
"\n",
"In principle, it should also work with [Anaconda Navigator][an], but because [Wand][wp] is not currently in the Anaconda base package set, you may run into difficulties making it work, as we did on macOS. These problems might not occur on Windows or Linux.\n",
"\n",
"This notebook uses the Python 2 kernel because macOS does not include Python 3, and we don't want to make adding that a prerequisite for those re-running this experiment on their own macOS systems. The code was written with Python 3 syntax changes in mind, but we haven't yet successfully tried it in a Python 3 Jupyter kernel.\n",
"\n",
"[an]: https://www.anaconda.com/distribution/\n",
"[jl]: https://github.com/jupyterlab/\n",
................................................................................
"\n",
"# Merge per-format test data into a single DataFrame without the first\n",
"# 3 rows: the initial empty repo state (boring) and the repo DB\n",
"# size as it \"settles\" in its first few checkins.\n",
"data = pd.concat(repo_sizes, axis=1).drop(range(3))\n",
"\n",
"mpl.rcParams['figure.figsize'] = (6, 4)\n",
"ax = data.plot(kind = 'bar', colormap = 'coolwarm',\n",
" grid = False, width = 0.8,\n",
" edgecolor = 'white', linewidth = 2)\n",
"ax.axes.set_xlabel('Checkin index')\n",
"ax.axes.set_ylabel('Repo size (MiB)')\n",
"plt.savefig('image-format-vs-repo-size.svg', transparent=True)\n",
"plt.show()"

[oox]: https://en.wikipedia.org/wiki/Office_Open_XML
[wi]: https://en.wikipedia.org/wiki/Windows_Installer
## Demonstration
The companion [`image-format-vs-repo-size.ipynb` file][nb] is a
[Jupyter][jp] notebook implementing the following experiment:
1. Create an empty Fossil repository; save its initial size.
2. Use [ImageMagick][im] via [Wand][wp] to generate a JPEG file of a
particular size — currently 256 px² — filled with Gaussian noise to
make data compression more difficult than with a solid-color image.
................................................................................
modify the notebook to try different things. Want to see how the
results change with a different image size? Easy, change the `size`
value in the second cell of the notebook. Want to try more image
formats? You can put anything ImageMagick can recognize into the
`formats` list. Want to find the break-even point for images like those
in your own repository? Easily done with a small amount of code.
[im]: https://www.imagemagick.org/
[jp]: https://jupyter.org/
[nb]: ./image-format-vs-repo-size.ipynb
[wp]: http://wand-py.org/
## Results
Running the notebook gives a bar chart something like⁴ this:
![results bar chart](./image-format-vs-repo-size.svg)

[oox]: https://en.wikipedia.org/wiki/Office_Open_XML
[wi]: https://en.wikipedia.org/wiki/Windows_Installer
## Demonstration
The companion `image-format-vs-repo-size.ipynb` file ([download][nbd],
[preview][nbp]) is a [Jupyter][jp] notebook implementing the following
experiment:
1. Create an empty Fossil repository; save its initial size.
2. Use [ImageMagick][im] via [Wand][wp] to generate a JPEG file of a
particular size — currently 256 px² — filled with Gaussian noise to
make data compression more difficult than with a solid-color image.
................................................................................
modify the notebook to try different things. Want to see how the
results change with a different image size? Easy, change the `size`
value in the second cell of the notebook. Want to try more image
formats? You can put anything ImageMagick can recognize into the
`formats` list. Want to find the break-even point for images like those
in your own repository? Easily done with a small amount of code.
[im]: https://www.imagemagick.org/
[jp]: https://jupyter.org/
[nbd]: ./image-format-vs-repo-size.ipynb
[nbp]: https://nbviewer.jupyter.org/urls/fossil-scm.org/fossil/doc/trunk/www/image-format-vs-repo-size.ipynb
[wp]: http://wand-py.org/
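
The reason step 2 fills the test image with Gaussian noise rather than a solid color can be sketched with the Python standard library alone. This is not the notebook's code: it works on raw one-byte-per-pixel buffers rather than real JPEG files, uses plain `zlib` as a stand-in for whatever compression applies, and does not model Fossil's delta compression. The names `SIZE` and `compressed_ratio` are hypothetical, chosen only for this illustration.

```python
import random
import zlib


def compressed_ratio(pixels: bytes) -> float:
    """Return compressed size divided by original size, using zlib."""
    return len(zlib.compress(pixels, 9)) / len(pixels)


SIZE = 256  # hypothetical stand-in for the notebook's `size` value
rng = random.Random(42)  # fixed seed so reruns produce the same buffers

# One byte per pixel: mid-gray plus Gaussian noise, clamped to 0..255.
noisy = bytes(
    min(255, max(0, int(rng.gauss(128, 40))))
    for _ in range(SIZE * SIZE)
)

# A solid mid-gray image with the same dimensions.
solid = bytes([128]) * (SIZE * SIZE)

noisy_ratio = compressed_ratio(noisy)
solid_ratio = compressed_ratio(solid)
print(f"noise: {noisy_ratio:.3f}  solid: {solid_ratio:.3f}")
```

The solid buffer should shrink to a tiny fraction of its original size while the noisy one barely shrinks at all, which is why a noisy image is the harder, more realistic test case for repository growth.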
## Results
Running the notebook gives a bar chart something like⁴ this:
![results bar chart](./image-format-vs-repo-size.svg)
