Unevenly Weighted Random Numbers

A couple of weeks ago, I wrote about weighted random numbers. After implementation and some experimentation, I settled on a versatile function that incorporates all the fetchers of the weighting system. Mathematically, it's a little ugly because there is an “if” statement and we end up with a piecewise function.

Where m is the minimum value, M is the maximum value, c is the center point (m ≤ c ≤ M), S is the concentration coefficient (useful range 1 ≤ S < ∞), and α and β are random numbers between 0 and 1. The core of this function is the weighting.

This has been scaled so the output is in a given range.

Here, m ≤ ws ≤ M were as 0 ≤ w ≤ 1. From here, the body of the function is split before and after the center point. For this we require a second random number, α. This value is used to determine of the value is to the left or right of center, and the min and max of the function are adjusted accordingly.

Min (m) Max (M) Center (c) Concentration (S)

The top graph shows the distribution of 1,000 samples, and the lower graph shows a histogram of the distribution. The average is calculated over all the samples. If the center value is half-way between min and max, the average should be the center value (or close to). The center value reflects the highest peak value in the histogram, which should always be close to the specified center.

There are some things you can do with this function that are not meaningful. Having a center value outside the min and max value will still generate values, but probably not useful for anything.

You can also use a concentration coefficient less than one and greater than zero (0 ≤ S < ≤ 1) . This has the effect of pushing the concentration away from the center point and toward the min and max values—basically the acting in the reverse of the normal algorithm. This may be useful for generating a value that is usually either one value or an other, with very little in between.

Here the min is 0, max is 100, center is 20, and the concentration coefficient is 0.1. Notice how the center point is the least populated area of the graph.

There are some ways to use this function to generate some of the other weighted functions. For example, let c = ½ (M – m) + m. This will make the function have equal distribution on both sides of the center point.

Here, the function C is a centered function, c is the center point, and s is the span that can be deviated from the center.

For a simple left or right weighted version of the function, simply set the center point to the min value (left weighted) or max value (right weighted).

Using a concentration coefficient of one (S =1) results in just random uniform random data (assuming β is random). Small values of S are harder to notice in this demo, but become pronounced when more samples are used.

Here is an example of a center at 70, min of 0, max of 100, and a concentration coefficient of 2. At 1000 samples it is not apparent there is any concentration, but at 100,000 samples it is easier to see. The higher sample set also makes the histogram more clear. Notice how the histogram falls to around 100 on both sides, but more rapidly to the right of center. This is necessary because of the uneven weight. So a 0 or a 100 are both equally likely (or unlikely as the case may be), but a 60 and 80, despite being equal distance from the center point are not both as likely as one an other (higher likelihood of 60 over 80).

Congratulations Tazz

Weighted Random Number

I've written articles about weighted random number in the past, but today I ran into a use I've been meaning to explain for a long time.

For example when rolling two dice, the mostly likely number to roll is 7. With 4 dice, it's 14. These are weighted rolls in the context of this article as the likely outcomes are not evenly distributed, but tend toward some center point.

Where α1 and α2 are random numbers between 0 and 1, and S is the “scatter coefficient”. The root of this function is the banding part.

This weights the roll toward 0. The larger the value of S, strong the pull toward 1. Using two of these functions together give a range the function a peak centered at 0 that goes both positive and genitive. Note that the last part of the function normalizes the output so it is between 0 and 1. The process will be explained in a bit, but this function will be called nb(S). So in parts, the full function is:

This function can be simplified if the square root is removed. The root makes the curve more gradual, but this isn't needed.

The trick to this function is the use of the -1, +1 in the denominator. This allows the scatter coefficient to have a defined range between negative infinity and positive infinity (i.e. -∞ < S < +∞), although the useful range is 0 ≤ S < +∞.

The normalized function looks like this:

Rebuilding the center-weighted function results in:

So g( α1, α2, S ) is our weighted function. α1 and α2 are random numbers between 0 and 1. S is the scatted coefficient 0 ≤ S < ∞. The larger the value of S, the more weighted the output is toward 0.

The graph above shows the histogram for distribution for various scatter values and illustrates how as the scatter coefficient increases, the concentration toward the center increases. Note that this function does not create a bell curve (or normal distribution). Instead it has a sharp point at the center. This means that for larger values of S the likelihood of being away from the center point diminishes very rapidly—much more than it would with a function that has normal distribution. So the function favors the center point more strongly than those producing normal distribution.

Now some of the function's versatility. The function is normally used to generate some range.

Here, M is a scale factor (magnitude) and c is an offset that allows the function to have a range such that -(M + c) < v < (M + c). Now a function can be defined to return a value in a given range with some weight.

Where vmin < w( vmin, vmax, S ) < vmax. The floor function makes sure the values are integer numbers, and can be omitted if real number are desired. The center point will always be half way between vmin and vmax.

This function can be modified slightly to simulate a dice roll. Let n be the number of dice, and s be the number of sides on each die. Then vmin = n, vmax = n * s. The scatter coefficient (S) can be varied, but the distribution will not be identical to that of an actual dice roll.

Here the floor function is required. n < d( n, s, S ) < n*s.

In this histrogram, the difference in distribution can be seen between an actual dice roll (in this case, five 6-sides die) and the simulated function d( n, s, S ) where S = 3. Note they both peak at the same location (between 17 and 18) with roughly the same likelihood for these numbers. However, the chances for rolling a 15 are greater with a true dice roll, and less in the simulated. Likewise, rolling an 8 is less likely with dice, and more likely simulated. Keep in mind that the simulated dice roll can do something an actual dice roll can not: produce fractional results. If the floor function part of d( n, s, S ) is removed, any real number in the range can be returned. So while an exhaustive check for every dice roll is possible, every simulated roll is not. Thus, the graph above used one million samples to produce the simulated histrogram.

There are some additional way the function g( α1, α2, S ) can be used. If an uncentered value desired, the random input can be fixed.

These histrogram show the output of 10,000 samples of the function, where α is a random number (0 ≤ α ≤ 1). Note how in both cases, when S = 1 the distribution is uniform for all values. This is because when S = 1, the weighting function is doing nothing, and the random value α is being returned.

SPAM

Boston

Started getting SPAM to two e-mail addresses from the same group: Dice Stars Casino. They somehow got my last.fm e-mail address as well as my linkedin e-mail address. I use a unique e-mail address for every online service so that when I get SPAM, I know the origin, and I can remove the address. It's strange that the same group got two addresses and started using them within days of each other, but it seems even more strange that large site like linkedin and last.fm both somehow gave up my address. It's possible that somehow my e-mail address alias list was compromised, but that seems rather unlikely. Time to keep eyes open.

1 comment has been made.

From asdf

May 11, 2012 at 5:13 PM

Star Polygons

I was doing some work in Google Sketchup, and started experimenting with star polygons. I was drawing an 8-point star when it dawned on me was more than one configuration that could be used to draw such a star. After a little reading, I discovered the nomenclature on this topic. Using the Schläfli symbol, I discovered what I had normally been drawing when I made stay polygons was of the form {p /%u230Ap / 2%u230B-1}. Schläfli symbol is of the form {p / q}, where p is the number of points in the star, and q is the number of points between connecting lines. For example, an octagon shape is {8 / 1} as it has 8 points, and each line is connected to the very next point. A pentagram is {5 / 2}, having 5 points, and each line is connected to the 2nd closest point to either side. What I had been drawing was a star with the distance between points always %u230Ap / 2%u230B-1, or having the connecting points as far away from one an other as possible for the star.

After learning this, I decided to create a little web application to demonstrate this.

Points.

Points between connections.

Line width.

The demo is done using Scalable Vector Graphics (SVG), with some Javascript used manipulate the image. The math is quite simple. First we get the distance between vertices (points on edge). The distance is in degrees (or radians). For example, an 8-point star is 360º / 8º = 45º degrees between points. To draw an octagon, we simply start a 0º and draw a point to 45º, and then 45º to 90º, ext. In order to be coordinates, we need a distance from the center—the radius—which depends on the height and width of the image. The SVG image uses standard computer coordinate—that is (0,0) is the upper left part of the screen. To convert from the polar coordinates requires first knowing where the center of the view port is located. This is half the width and height of the view port. Polar coordinates typically start with 0º on the right side, but I wanted it like a clock—0º on the top. So the polar to screen conversions are as follows: x = centerX – radius * sin( angle ), y = centerY – radius * cos( angle ).

The only trick comes when drawing star figures. For example, a {6 / 2} star is actually two {3 / 1} stars (the notation is 2 {3 / 1}), and not a single continuous path. For this case, we need only to know how many smaller polygons this figure is made from, and draw each of them offset one point from the previous. For example, a {15 / 6} is the same as 3 {5 / 2}. This means there are 3 pentagrams, each offset 24º (360º / 15 sides). So the first pentagram would be drawn with it's tip at 0º, the second at 24º, ext.

D.C.

Ubuntu 12.04 was released today. After it was, and I managed to get on their website, I started doing a torrent download of, well, all of them. If nothing else, it's a good test of our bandwidth. Our connection has been holding around 5.2 Mbit/sec, peaking out around 5.36 Mbits/sec. I don't even know what speeds our ISP say we should have, but it's nice to give them a workout from time to time.

Why do I need all the flavors of Ubuntu? Well, my main computer is a 64-bit machine, but I have several virtual machines setup, and several other computers that run Ubuntu as their primary OS. So having each of the types (desktop, server, and alternate both i386 and AMD64) will save me a step in the future. Otherwise I always find I need the version I don't have downloaded.

Booting Firefox full screen on system start

Lake Champlain, Vermont

The first question someone might ask is why one would want to have a Linux system that starts Firefox full-screen on boot. A web browser makes a good cross-platform human-machine interface. While there are still issues with web pages looking different in different browsers, these issues have become less and less as browser developers comply to standards. Javascript with AJAX and server-side scripting allow for the developer to create rich applications. So designing a user-interface with a web browser is sometimes a great option. One example would be a controller for home automation. After some back-end scripting to handle calls the server would make to change settings, making an interface is as simple as making a web page. On a low-cost touch screen might server this setup from multiple locations in a house. I've implemented an interface to a large engine controller using Firefox. So there are reasons one might want such a setup.

I want to share something I learned and had wanted to explore for a long time. What is minimum required to have a Linux box boot into Firefox? This was a question I first asked sometime in 2005, when I chopped down a Debian install of Linux to make it small enough to fit on a compact flash to be used on single board computer. I recall I did get the system to fit on a 512 MB compact flash, and the boot setup optimized from power-on to fully running happened in 30 seconds. The next part of the project would have been to get Firefox to start full screen on this setup, but unfortunately never happened as the project was ended before we reached this phase.

I have much more experience with Linux as a text-based server than I do with it as a desktop. DrQue.net has been running on a Linux setup since 2003, but it wasn't until around 2008 I was regularly using Ubuntu on my laptop. How X Windows (X11), the windows manager that sits atop it, and applications that run in the GUI fit together is still rather fuzzy to me. However, I knew that it all starts with X11—so I had to have that. But the windows manager, user log-in screen and all the other things I'm use to seeing on a Ubuntu desktop setup I was less clear about. Turns out for Firefox, you only need X11 and Firefox. No windows manager, no log-in handler, or anything else.

So after install just the bare-bones system, you only need get fetch two additional packages. I added a third, unclutter, so I could turn the mouse off after some timeout period (I'll address that latter).

apt-get install xorg firefox unclutter –no-install-recommends

To start Firefox from the command prompt, first start X11, and then start Firefox:

startx & firefox –display=:0.0

Firefox needs to be told where the display is, but that's the only caveat. To get this to happen on system start up requires a couple of additions. First, add a line to /etc/rc.local

su <user name> -c startx

This will start X11 under some user name. By default, most users are not allowed to start X11. This can be changed be editing /etc/X11/Xwrapper.config and modifying :

allowed_users=anybody

This will let anyone start X11. There is a security risk here as everyone recommends setting this value back, but my system is a single user system with no keyboard/mouse. So I'm not going to be terribly concerned about it.

Now that X11 is setup to start when the system does, it's time to add Firefox to the mix. For this, create ~/.xinitrc with the line

firefox <url>

This will cause Firefox to start with X11 for the specified user, and go to some URL. My setup goes to a local web page so that Firefox begins to view some AJAX web page.

In order to get the setup to run full-screen, you will need a plug in for Firefox called “autohide”. I e-mailed the developer of this plug-in to thank him for his work, and he stated that he no longer does anything with this project. So while it works now, it isn't going to be maintained. The developer also states it was only tested on a windows-based machine, but it does work fine under Linux. What this plug-in will allow is the command line option “-fullscreen” which will cause Firefox to start full screen.

In my setup I found the X11 places a resize triangle in the bottom right corner, even when Firefox is full-screen. Since I want nothing on the screen but the content of my web page, I was not pleased with this artifact. While I didn't find a way to remove it, I did find a workaround. On the command line, you can specify the height and width of the window. Making the height 24 pixels longer than the screen resolution places the triangle out of view.

The last item I had to change was the mouse cursor. My system doesn't normally have a mouse connected, so having the cursor on the screen is rather moot. The package “unclutter” can take care of this. One of the parameters is a delay for how long of a pause to allow before turning off the mouse cursor. Setting this delay time to 1 second makes the mouse go away quickly on start. So the ~/.xinitrc becomes something like this:

unclutter -idle 1 &

firefox -height 1048 -width 1280 -fullscreen 127.0.0.1

Where 1024x1280 is the screen resolution for my system.

The last item, and one I found really frustrating to get functional is disabling the screen saver. By default after 10 minutes of no keyboard/mouse activity X11 blanks the screen. Since there is no keyboard or mouse on my setup, this is an issue. I found no better way to stop the screen saver than by doing this in the X11 configuration. I created the file /etc/X11/xorg.conf and added the following lines:

Section "ServerLayout"

Identifier "Default Layout"

Option "BlankTime" "0"

Option "StandbyTime" "0"

Option "SuspendTime" "0"

Option "OffTime" "0"

EndSection

Of all the methods to disable screen blanking I tried, this was the only one that worked. Other methods involved using the command xset, but they didn't work for me.

Javascript web application with elected master

My project recently had a unique requirement. The setup is like this: one configuration script, and one or more views. The configuration script is used to adjust settings that multiple views receive, and this works through an AJAX setup. This is accomplished by a very simple server-side script that simply writes all the parameters it was passed into a file. The views all regularly load this file and adjust their content accordingly. So far, nothing too complicated—almost a text-book AJAX application.

Where is gets complicated is in the fact there are several timers, and the timer counts are javascript controlled. Feedback to the control script was needed so the timers could be monitored. That's easy until you have more than one view. Each view has it's own set of timers, and they would all be fighting to write their data back to the server. This creates a mess of chase conditions. What is needed is one dedicated time master view. This time master will be the only view to return it's clock data to the server. All the other views will be passive observers, basing their clocks on this master clock.

So how it is it decided which view is the time master? We could force the user to pick a master, but what would be better is if this were automatically negotiated.

The first thing to think about is the configuration script. It needs to see that time data is being regularly updated from someone—anyone. The time data may not always be changing—clocks could be stopped. So each time the master view writes data back to the server, it also writes a unique time stamp. This time stamp is simply a random number. The configuration script is periodically reloading the time data, and so as long as this time stamp is changing, the configuration script knows the data is being modified by some remote view. If the time stamp is unchanging, either there is no time master, or there are no open views. Since at least one view is required to get time feedback, the script will assume that there is no time master assigned.

When it is determined there is no time master, the configuration script will signal that a time master is needed. This is accomplished by using a the time master ID. This value will hold some identifying value of the current time master. When a new time master is needed, the value will be set to some token value (I used -1 as this value). The view scripts will see the master ID as part of their periodically updated data, and when the value is seen as the master needed token, they will generate a master ID. This is just some unique value (again, a random number). Once generated, they will write their clock data back to the server, along with their master ID. Now it is a chase condition, but one in which we don't care who wins. The server will prevent two scripts from writing data to a file at the same time. So at some point, the configuration script will load the clock data, and the last view to have written the clock data will have it's data on the server.

The configuration script will see the change in time stamp, and a time master ID to be used. It will latch this ID and write it's own data back to the server. As a time master is no longer needed, subsequent changes to the time master ID from other views will be ignored. Each of the views will then read the configuration data and see this new time master ID. Only one view has this ID, and it will continue to push it's time data to the server. The remaining views will see that the time master ID does not match the one they generated, see the ID is no longer the master request token, and revert to being observers.

In this way, the views elected from themselves which will be the time master. Should the master view be closed updates to the time data will stop being made. The configuration script will see these changes are no longer taking place, and request a new master from the remaining pool of views.

One more mode of operation was added: a self-elected time master. This is setup for times when there is a “main view” that is always desired to be the time master. For this, an other time master ID token is used (I used 1 for this value). When a self-elected master view comes online, it always writes it's clock data. When the configuration script sees this self-elected token ID, it will switch the time master ID to this token value. The acting time master (if there was one) will yield and become an observer automatically just as if it's request for being time master was ignored.

This setup has some short comings. Two self-elected time masters will fight with one an other and create the original problem. The falling out of the time master will cause the clocks to have a delay caused by the clock values not being updated until a new time master is elected. And

Ubuntu web-based display overview

I've been working on a project for my old boss. Back in the day I developed a HMI that was web-based, and decided to go the same route with this project. While Java was suppose to be the cross-platform future of the computer world, it seems Javascript and HTML have actually done it. Just about every device has a web browser, from PCs to smart phones.

Web-based applications need a server. Initially I looked into making a wireless router a server. They are cheap, and the application's traffic was low. However, the application needed some device to plug into a master display, and routers don't have video output. My boss found Foxconn Netbox. Based on an Intel Atom processor, the computer has HDMI video output, can boot form a SD card, and wireless networking.

The requirements for the software setup were not much: Apache web server with PHP, Firefox running full-screen, and ad-hoc wireless networking. My requirements were basically to make an embedded Linux build, which I've done in the past with Debian. I've been using Ubuntu (or variants) for my OS of choice lately, and I thought I'd continue this trend. This is where my adventures began.

Ubuntu 11.10 desktop installed easily on the Foxconn system. I used a 4 GB USB drive as the install drive, and installed to an 8 GB SD card. To create the USB boot image, I used software from pendrivelinux. The Apache/PHP stack was easy enough, and Firefox comes installed. Getting the ad-hoc wireless network setup turned out to be a problem. After a lot of reading and playing around, I found some documentation that explained that under Ubuntu 11.10, IPv6 causes problems for ad-hoc networking. Once IPv6 was disabled, things started to work. After getting this setup mostly functional, I messed something up trying to completely disable the screen saver/blank (which is a lot harder to do then it should be). Some package I installed caused GRUB to wait at the kernel selection screen on boot, which is no good because the computer will not have a keyboard connected in the finial setup.

The IPv6 bug doesn't happen in 11.04, so I decided to go that route for my next round. In addition, I wanted a smaller setup. While I had an 8 GB SD card, I didn't want to use that much space. I started with Ubuntu server, and with nothing else installed. Since I just wanted Firefox, I was hoping I could get away without having a windows manager—and I turned out to be correct. The only packages I needed to select were Xorg and Firefox. Modifying ~/.xinitrc let me start Firefox when X-windows launched. The old but still very functional Autohide extension let me start Firefox full-screen.

Ad-hoc networking was harder than when I had a user interface to set it up. I toiled for some time trying to get it to work before I found this script. It was exactly what I needed. Basically, all I wanted as an ad-hoc wireless network with no password to provide access to the local computer. This script does connection sharing, which is just an added bonus—most of the time, the Ethernet port will not be plugged in. But connecting computers need to be assigned an IP address and get have the domain name of the local computer in DNS. This way, someone connecting can simply enter the computer name in their web browser and have access to the interface.

After everything was working, it was time to put the install on a diet. I began to delete everything I didn't think was required to make the system run. I trimmed the 2.1 GB install down to around 700 MB. This I figured would be small enough to package up and send to my boss. To package the system in Linux, I can just use “dd” to copy the image from the SD device to an other computer via SSH. But my boss doesn't run Linux, so he would need a windows solution. I found a windows program that makes USB drive images, and that would work.

One problem with a disk image is it is the same size as the disk—in this case, 8 GB. Only using 700 MB means a lot of empty space, so it makes sense to compress the data. There is a cavort though. Just because you erase something doesn't make it go away on the disk. So in order to compress the image, you have to clear the free space first. I found the “zerofree” package, which flushes all the free space on a drive. This allowed the 8 GB image to compress to around 250 MB.

I could (and should) write in detail about several parts of this project, but this is good for an overview.

Algebraic Curve with Curvature

Now, some actual application of what I learned in calculus 3. The previous demo was on curve fitting algebraic curves. Now that this is done, it's time to add the tangent, normal, and curvature circle. To start, we will need the tangent line of this function. Note that the function is now two functions, one for the x-coordinate, and one of the y-coordinate. This can be expressed using a vector function, f( t ) = ( X( t ), Y( t ) ). To get the tangent line, we still need a rise over run based on the derivative at some point a. Well, the rise is based solely on Y( t ), and the run solely on X( t ). So the resulting slope then becomes Y '( a ) / X '( a ). The tangent line function itself is y = [ Y '( a ) / X '( a ) ][ x - X( a ) ] + Y( a ). The normal line is again just a 90-degree rotation of the tangent line, easily obtained by taking the negative inverse of the slope. The curvature now requires both the x and y components of the function, and the full function is κ( t ) = [ X '( t ) Y ''( t ) - Y '( t ) X ''( t ) ] / [ X '( t )2 + Y '( t )2 ]2/3. It looks a little messy, but the reality is the function just involves the first and second derivatives—it's not that bad.

The scroll bar selects the location to place the tangent line, normal, and curvature circle. Since the curve is no longer a function and can double back on itself, I could no longer use the floating bar above the image. If you adjust the curve, you may notice the curvature circle is sometimes on the outside of the function, looping in the wrong direction. This is one irritating item I've been trying to fix. It seems from the fact the normal does not always point inward, and I have thus far been unsuccessful in figuring out how to make this happen. It does, however, give me something else to work on.