Step/Dir signal generator for CNC

I've coded a new Propeller-based step/dir signal generator intended for CNC applications. Instead of the standard technique of using waitcnt and port-pin toggles, it uses a new "software PLL" algorithm. A counter in NCO mode (A) is programmed to generate the step pulses with the desired frequency. A second counter (B) counts the steps actually output. So PHSB and PHSA represent the whole and fractional part (phase) of the pulse stream and are used as feedback to the control loop. This allows relatively high frequencies (>4MHz) with only moderate software cycle times (~1ms). Multiple axes (one per cog) can be synchronized easily. The main process can feed trajectory data as one vector of coordinates per time slice (cycle time of the PLL loops).
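A rough host-side sketch of the idea in Python (not Spin/PASM): `frqa` and `phsa` mimic the counter registers, while the fixed-point scaling, loop gain and slice length are illustrative assumptions, not the actual firmware values.

```python
# Host-side simulation of the "software PLL" concept (illustrative only).
# frqa/phsa mirror the Propeller counter registers; STEP and the simple
# proportional correction are assumptions for demonstration.

STEP = 1 << 16          # assumed scaling: one full step = 2^16 phase units

def run_pll(targets, cycles_per_slice=100):
    """Track a list of per-slice target positions (in steps)."""
    phsa = 0            # NCO phase accumulator (whole + fractional steps)
    frqa = 0            # added to phsa every "clock": sets step frequency
    out = []
    for target in targets:
        # control loop: measured position = whole steps + fractional phase
        measured = phsa / STEP
        error = target - measured
        # proportional correction: spread the error over one time slice
        frqa = int(error * STEP / cycles_per_slice)
        for _ in range(cycles_per_slice):
            phsa += frqa          # the hardware NCO would toggle the pin
        out.append(phsa / STEP)   # position actually reached
    return out

positions = run_pll([10, 20, 30, 40])
# tracks the commanded positions to within a small fraction of a step
```

The feedback keeps the accumulated phase locked to the commanded trajectory, so rounding in `frqa` never sums up into lost steps.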

This video shows the first test with only one motor. I plan to add it to the OBEX library. But I have to run some more tests first, to make sure it works with multiple axes and maintains accurate timing and positioning even with complex trajectories. But if anybody is interested I could post a beta release.

I wonder if it would be better to hide the VAR block inside the StepDirGen object and instantiate multiple objects, one for each axis, in the main module instead of declaring multiple VAR blocks and only one object. It should not make any difference in the memory footprint, but we could eliminate the need for passing @position at every method call. For now, I decided to use the public VAR block solution to allow easier debugging.

Comments

Your approach is pretty cool. Until now I have used the Bresenham algorithm to get exact synchronisation between two axes.
The precision of the Bresenham algorithm comes from how it works; by construction, this algorithm can't be imprecise.

Can your approach reach a precision of 1 step in the synchronisation between multiple axes?
Do I have to watch the number of steps already created and then stop the pulse creation, or does your approach enable "set and forget"?

By set and forget I mean: set up the number of steps and the frequency, give a start command, and then ALL cogs can do other things while the
step creation stops automatically?

Until now I have used the Bresenham algorithm to get exact synchronisation between two axes.
The precision of the Bresenham algorithm comes from how it works; by construction, this algorithm can't be imprecise.

The Bresenham algorithm is precise within one step with regard to position. But it is very imprecise regarding velocity! The generated signal for the slower (interpolated) axis contains a lot of jitter. Let's draw a line from X0 Y0 to X100 Y75. Bresenham generates one X-step per cycle and an average of 3/4 Y-steps per cycle. This means the Y-steps follow a pattern of step, step, pause, step and so on. The time between two steps varies by a factor of two! The motor has to accelerate and decelerate all the time even while running at constant (average) speed. This is very bad because the stepper makes much more noise and vibration than necessary. Since the resulting acceleration peaks have a broad frequency band and can hit any resonance frequency of the mechanical system, they can even make the motor stall in extreme cases.
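You can see the jitter with a quick simulation (Python here, just for illustration): for the 100x75 line, the gap between consecutive Y-steps alternates between 1 and 2 X-cycles, a 2:1 velocity variation.

```python
# Illustration of Bresenham Y-step jitter on a 100x75 line (not
# Propeller code): list the X-cycle counts between consecutive Y-steps.

def bresenham_y_intervals(dx, dy):
    """X-cycle counts between consecutive Y-steps (requires dx >= dy)."""
    err = dx // 2
    intervals, since_last = [], 0
    for _ in range(dx):
        since_last += 1
        err -= dy
        if err < 0:           # a Y-step fires this cycle
            err += dx
            intervals.append(since_last)
            since_last = 0
    return intervals

gaps = bresenham_y_intervals(100, 75)
# 75 Y-steps over 100 cycles, but the gaps are a mix of 1 and 2 cycles:
# the instantaneous Y velocity jumps between 1x and 0.5x all the time
```

The position is exact (75 steps in 100 cycles), but the irregular spacing is exactly the acceleration/deceleration hammering described above.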

Can your approach reach a precision of 1 step in the synchronisation between multiple axes?

Even better, it can reach sub-step precision. If you take the line in the example above, the 3 Y-steps will be evenly distributed over the duration of 4 X-steps. This is because I not only count pulses but also control the phase relationship.
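An integer DDS model (Python sketch, my own illustration of the phase-control idea, with an assumed 1000 clocks per X-step) shows the even spacing:

```python
# Integer DDS sketch: emit num steps per den clock cycles and record
# the clock indices of the emitted steps. All numbers are illustrative.

def nco_step_cycles(num, den, n_cycles):
    """Phase accumulator adds 'num' per clock, wraps at 'den'."""
    phase, steps = 0, []
    for cycle in range(1, n_cycles + 1):
        phase += num
        if phase >= den:
            phase -= den
            steps.append(cycle)
    return steps

# 3 Y-steps spread over the time of 4 X-steps (1000 clocks per X-step):
y = nco_step_cycles(3, 4000, 4000)
gaps = [b - a for a, b in zip(y, y[1:])]
# all gaps are 1333 clocks, instead of Bresenham's mix of 1000 and 2000
```

Because the phase (the fractional part) is carried across X-steps, the Y-steps land at even 4/3-step spacing rather than snapping to whole X-cycles.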

Do I have to watch the number of steps already created and then stop the pulse creation, or does your approach enable "set and forget"?
By set and forget I mean: set up the number of steps and the frequency, give a start command, and then ALL cogs can do other things while the
step creation stops automatically?

No, set and forget requires one additional cog. The current software requires a continuous stream of one vector of coordinates per time slice. So for a move along a line taking 200ms you need to provide 200 coordinates, one per ms. The time of the acceleration ramp is added to the total time. For example, if the ramping takes 0.1s then the total time of the move will be 0.3s. The time from 0.2 to 0.3s is idle, so the "main" cog could do something else while waiting for completion. But during the first 0.2s it's busy. The axis cogs generating the step and dir signals are busy all the time and can't do anything else unless motion is stopped completely.

I plan to write higher-level software later to provide "set and forget" functionality. The current target application is CNC milling of free-shape contours, where I have to provide thousands of vectors per cut anyway and, in the worst case, there are no straight lines at all. "Set and forget" is more useful for point-to-point applications like pick&place, where it reduces the amount of data and CPU load enormously, of course.

I've done something similar to this, except I don't count the pulses using a counter. Instead I use a loop of constant length and the fact that the number of pulses is the loop time multiplied by the frequency. That way you can in theory do two axes per cog. I have never got around to doing the synchronisation stuff though. It is on my big list.

Yes, this is the standard DDS approach. It is simple as long as you limit yourself to whole pulses per time slice, or at least a whole number of pulses between direction changes. If you allow partial pulses then you have to synchronize direction changes somehow to avoid setup/hold problems. In my software the input to the DDS is the output of a moving average filter with possible fractional numbers and rounding errors. I don't say it's impossible, I was just too lazy to design an algorithm that guarantees no fractional errors summing up. With the added "safety loopback counter" it is easy to assert no lost steps. If I run out of cogs it might be worth trying to find a better algorithm that is safe by design.
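To make the loopback-counter idea concrete, here is my own illustrative reconstruction in Python (not the actual firmware; the window size and fixed-point scale are assumptions): a DDS fed from a moving average emits whole pulses, and a separate counter tallies what was actually emitted so it can be compared against the commanded total.

```python
# Sketch of the "safety loopback counter" idea: feed a DDS from a
# moving average of commanded per-slice deltas, count emitted pulses.
from collections import deque

def dds_with_loopback(deltas, window=4, scale=1 << 16):
    acc = 0                  # DDS phase accumulator (fixed point)
    loopback = 0             # counts pulses actually emitted
    history = deque([0.0] * window, maxlen=window)
    for d in deltas:
        history.append(d)
        avg = sum(history) / window      # moving-average filter
        acc += int(avg * scale)          # fractional pulses allowed
        emitted, acc = divmod(acc, scale)
        loopback += emitted              # the hardware counter's job
    return loopback

# command 40 steps of 1.0; the loopback count lags only by the
# filter's startup delay plus the fraction still in the accumulator
steps = dds_with_loopback([1.0] * 40, window=4)
```

Comparing `loopback` (plus the known filter lag) against the commanded total at any time asserts that rounding never dropped a step.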

Thank you very much for sharing your knowledge about stepper control. I think I learned some important facts.
You are right: for free-shape CNC milling you have to send lots of vectors in sequence anyway. This brings me back to another idea.
For a CNC interpreter, somebody in another thread suggested reducing the command set and letting a PC create a sequence of very small vectors to approximate a circle, etc.
Did you do any calculations on what feed rates are possible depending on the precision?
Coding in Spin is simpler than PASM (for me), but to gain speed I would be willing to dive deeper into PASM. And I guess watching a loopback counter wouldn't be that hard.

Are you planning to release code or do you just want to share the basic ideas?

For a CNC interpreter, somebody in another thread suggested reducing the command set and letting a PC create a sequence of very small vectors to approximate a circle, etc.

That's exactly what I want to do. The Propeller is ideal for the low-level, timing-critical stuff. But it lacks the memory to store large CNC programs and the graphics for a decent GUI, and you need a PC to do CAD and CAM anyway.

In a real-time system you have to design for the worst case. If the motion controller can handle free-shape contours it can also handle circles and lines. Some bandwidth is then wasted, but that costs nothing.

Did you do any calculations on what feed rates are possible depending on the precision?

I don't worry about that. The limiting factor is always the mechanical system. With a possible step rate >1MHz you can do a 1m/s feed rate at sub-µm resolution, and the algorithm should have a precision better than 90° phase angle, or 1/4 step. The worst error is the "Sehnenfehler" (chord error due to linear interpolation between coordinates instead of true circle arcs or splines). But no mechanical system on earth can keep up with this. If you need true precision for machining you have to slow down for a finishing pass anyway. This automatically reduces the chord error.
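A back-of-envelope check of the chord error, with hypothetical numbers (10mm arc radius, 50mm/s feed, 1ms slices; none of these come from the post):

```python
# Chord error ("Sehnenfehler") when approximating an arc of radius r
# by straight segments, one segment per time slice. Numbers are
# hypothetical, chosen only to show the order of magnitude.
import math

def chord_error(radius_mm, feed_mm_s, slice_s=0.001):
    seg = feed_mm_s * slice_s                 # segment length per slice
    # sagitta of a chord of length 'seg' on a circle of radius r
    return radius_mm - math.sqrt(radius_mm**2 - (seg / 2) ** 2)

# 10 mm radius at 50 mm/s: 0.05 mm segments
err = chord_error(10.0, 50.0)
# err is about 3e-5 mm (0.03 um), i.e. far below any mechanical error;
# it matches the small-angle approximation seg^2 / (8 r)
```

So at 1ms slices the linear interpolation error really is negligible compared to what the mechanics can resolve, which supports the argument above.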

A possible pitfall is that I currently handle ramping up/down with a moving average filter per axis. This means that if you travel along a long poly-line consisting of many small segments, the algorithm will "blend" the segmented moves and generate rounded corners (see the explanation of "constant velocity mode" vs. "exact stop mode" in Mach3). This is intended behaviour for my application. But you have to be careful when calculating feed rates. Driving too fast through narrow turns will throw you off the track, of course. This problem can be very complex for free-shape trajectories, and that's why I want to handle it on the PC side. Exact stop mode is also possible, of course: just do a WaitEndMove() after each line segment.

Coding in Spin is simpler than PASM (for me), but to gain speed I would be willing to dive deeper into PASM. And I guess watching a loopback counter wouldn't be that hard.

I think PASM is required for the pulse generator part only. The command interpreter and the linear and circular interpolation can be done in Spin. I'll provide a little demo as an example. There's a millisecond of time to calculate the next two or three coordinates. That should be enough to even do floating point math in Spin, I hope.

Are you planning to release code or do you just want to share the basic ideas?

I'm willing to share the PASM code for step generation and some demo code in Spin. However, I don't want to make the whole project open source. It will be a commercial product, and my customer wouldn't be happy if it were available on the internet for free.

And thanks for the positive feedback. I don't claim my method for generating pulses is better than others; it's just optimized for a special case. For single-axis point-to-point movement, the traditional waitcnt/toggle algorithm is simpler and needs only 1 cog/axis for "set and forget" functionality. If you have many axes, say more than 4 or 5, then it might be worth thinking about handling two axes per cog as Graham suggested. I think it should be possible to eliminate the second loopback counter, as the phase error should be well below one step per time slice even in open-loop NCO mode. PHSA could still be used as feedback to a PLL algorithm.

Desiko, I'll try to do the multi-axis test this afternoon. If nothing goes wrong and debugging is done, I'll change the VAR block to object-local and then I can release it, probably this weekend.

I still haven't managed to hook the board up to a real machine and test precision. But the signals look good and I think I can post the first release. Unfortunately, floating point math in pure Spin is a bit too slow. I was too lazy to code a well-optimized interpolation algorithm for the demo, so I just took Float32, wasting another cog.

I have been on holidays, so I have only just found your post... It looks great.

I have a gcodecompiler based on the LinuxCNC compiler that runs on a PC and can pre-compile a G-code file into 1ms vector moves (steps per time slice). This should work perfectly with your code. If you would like a copy just let me know and I will send it to you (I may have posted it in another thread; I can't remember). It takes care of all ramping, backlash compensation etc.

You will notice that there is a batch file that I have been running: gcodecompiler -ini gcodecompiler.ini m64-m65.ngc > test.txt. Just change the name of the input file and possibly the output file. There is a very small readme file as well.

Sorry, this is a Windows compile.

It should output a list of 6 axes' worth of steps to do within a 1ms time period, taking into account the parameters you have set up within the gcodecompiler.ini file, i.e. steps per unit (mm or inch), backlash, etc.

My original thought was to pre-compile the complete G-code file on the PC and then send this file to an SD card for the Prop to read, but the problem with this is that if the G-code file is huge in the first place, the resulting file will be huge ×1000. So I was thinking of starting the pre-compile process at the PC end and then just sending 10 seconds' worth of motion at a time to the SD card or RAM for the Prop to work with. Any thoughts on this?

This sounds interesting. I plan to write my own G-code compiler, but this will obviously take some time. So it would be nice to have something to play with before I'm finished.*

My original thought was to pre-compile the complete G-code file on the PC and then send this file to an SD card for the Prop to read, but the problem with this is that if the G-code file is huge in the first place, the resulting file will be huge ×1000. So I was thinking of starting the pre-compile process at the PC end and then just sending 10 seconds' worth of motion at a time to the SD card or RAM for the Prop to work with. Any thoughts on this?

Transferring the whole file at once is certainly not a good idea. Even if big SD cards have enough space to store a file of that size, it would take minutes to transfer it. And if you find out that you did something wrong after the first couple of moves, you have to start all over again.

There also must be a way to stop a running program if anything goes wrong. The queue of 1ms commands should not be too long, or the reaction delay would be too long. One or two seconds would be acceptable, I think. So if you hit a "soft" stop button on the PC, the Propeller stops after the delay of the buffer running empty and with proper ramping, so that no steps are lost and the user is able to continue later. There should also be a "hard" stop (emergency stop) signal going directly to the Propeller. If ramping is calculated on the PC, then the Propeller can do nothing but cut the step signals immediately, resulting in lost steps for stepper motors or faulted-out servo controllers due to over-torque (infinite deceleration).

Transferring 6 coordinates every 1ms interval takes at least 12kB/s plus handshaking. This is close to the edge of what can be done with RS232 or USB/VCP.
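The 12kB/s figure checks out if one assumes 16-bit coordinates (an assumption consistent with the number, not stated in the post):

```python
# Bandwidth check for the 1 ms vector stream. The 2-bytes-per-coordinate
# figure is an assumption that reproduces the 12 kB/s estimate above.

axes = 6
bytes_per_coord = 2            # assumed: 16-bit (relative) coordinates
slices_per_s = 1000            # one vector of coordinates per 1 ms slice

payload = axes * bytes_per_coord * slices_per_s   # raw bytes per second
baud_needed = payload * 10     # 8N1 framing: 10 bits on the wire per byte
# payload = 12000 B/s, so >= 120 kbaud before any handshaking overhead
```

At 120 kbaud minimum before protocol overhead, that is indeed at the edge of classic RS232 rates, though well within USB/VCP throughput.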

(* Actually, I already have a working G-code compiler, but it was written in Oberon on an Amiga-based system. I have already translated most of it to C++, but providing a user-friendly GUI to set up all parameters etc. is a really challenging task.)

Hi ManAtWork
This gcodecompiler is from the LinuxCNC code, so it should support all of the LinuxCNC codes.
My thought was to have one Prop chip (or some other micro) doing the comms with the PC; this Prop/micro then speaks to the step&dir Prop chip.
There are quite a few things to be considered here, and hence the reason I am stuck where I am (I can't make up my mind). It needs the ability to override the feed rate from 0 to 150-200%; this should be relatively simple by just changing the time slice. Feed hold/pause could be done via a feed rate override of 0%. There are markers in the precompiled code to indicate which line of the original G-code is being executed, so things like restarts shouldn't be too hard.
Another thing I have been pondering: since the code has been pre-compiled into steps per time slice, it should be quite simple to run the code backwards if required. I do a lot of work on CNC plasma cutters, and it is nice to be able to run the code backwards to the point of a flame-out and then restart from that point.

I finally had the time to do the multi-axis test today. I've added a circle/helix routine to the code. Unfortunately, I can't find my camera, so I can't post a video at the moment.

Desiko, have you tested anything yet? If nobody has found any bugs so far, I could add the release to the OBEX library.

Andrew, good idea. I also have plans to implement feed override. Just modifying the length of a time slice could cause problems near 0% because the slices become too long. But we could do it with some sort of pre-processor that sub-interpolates between subsequent vectors. I still don't have a good idea how to do the >100% part. If you just make the time slices shorter or skip vectors in the pre-processor, the precalculated ramps also get shorter, which would probably overload the drives with regard to acceleration. You could compensate for that by only running the machine at half the possible acceleration at 100%, but there must be a better solution somehow... If the PC handled the >100% case, there would be a delay due to the FIFO buffer when increasing the feed override above 100%, but I think that would be acceptable as long as decreasing it works immediately.
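The sub-interpolating pre-processor might look something like this sketch (Python; the function name and the simple linear re-sampling scheme are my assumptions, not anyone's actual design):

```python
# Sketch of a sub-interpolating feed-override pre-processor: re-sample
# the 1 ms vector stream so it is traversed at a fraction (or multiple)
# of the original speed. Linear interpolation between vectors assumed.

def override_stream(vectors, feed_pct):
    """Re-sample a 1-D vector stream to run at feed_pct of full speed."""
    scale = feed_pct / 100.0
    out, pos = [], 0.0             # 'pos' walks through the input stream
    while pos <= len(vectors) - 1:
        i = int(pos)
        frac = pos - i
        nxt = vectors[min(i + 1, len(vectors) - 1)]
        out.append(vectors[i] + frac * (nxt - vectors[i]))
        pos += scale               # <1: slower (more slices), >1: faster
    return out

slow = override_stream([0.0, 1.0, 2.0, 3.0], 50)   # 50% feed
# emits twice as many slices with half-size deltas: no overly long
# slices near 0%, unlike simply stretching the slice period
```

For >100% the same loop skips through the input faster, which illustrates the acceleration problem mentioned above: the precalculated ramps shrink along with everything else.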

Cool, a G-code compiler in Oberon! I never thought that somebody would mention this programming language again.
I haven't used it, but twenty years ago I read something about it. Most people I talked to about Oberon said it was too proprietary to use.

Would you mind sharing the code that you ported to C++?
You say G-code compiler. Does this compiler read in G-code and output vector coordinates in 1ms time slices?

That's a long story. A friend of mine wrote an Oberon compiler for the Amiga. Oberon was very cool in the late 80s; it was object oriented and very efficient. When I developed my own single-board computer I asked him if he would give me the source, because my processor (ColdFire) was very similar to the 68k of the Amiga. I only had to change a few instructions to make it generate ColdFire code. I also wrote my own operating system, sort of an Amiga OS stripped down to the bare minimum. I liked the Amiga OS because it was somewhat realtime-capable, as opposed to Windows. So it was ideal for machine control.

You say G-code compiler. Does this compiler read in G-code and output vector coordinates in 1ms time slices?

Yes, it did exactly that: reading G-code and outputting it in 1ms time slices. "Compiler" is probably not the correct word, because the output was real-time and it could also do some other things, like real-time control loops for thread cutting and gear cutting.

Sorry, no. I'll try to share at least the general-purpose Propeller part. But I'm doing that CNC business for a living and I can't give away everything.

I understand this. That's really OK.

But it shouldn't be too hard to find an open source G-code processor. EMC², for example, is open source and can also output position vectors in adjustable time slices.

Good hint.

At the school where I'm working as a teacher we have two NC milling machines, but they can only do one axis at a time.
If I ever modify them to become real CNC mills, I'm pretty sure I'll buy
the stepper motors and drivers from your company http://benezan-electronics.de/

Yes, this is the standard DDS approach. It is simple as long as you limit yourself to whole pulses per time slice, or at least a whole number of pulses between direction changes. If you allow partial pulses then you have to synchronize direction changes somehow to avoid setup/hold problems. In my software the input to the DDS is the output of a moving average filter with possible fractional numbers and rounding errors. I don't say it's impossible, I was just too lazy to design an algorithm that guarantees no fractional errors summing up. With the added "safety loopback counter" it is easy to assert no lost steps. If I run out of cogs it might be worth trying to find a better algorithm that is safe by design.

Sorry, I have not been online for a while. Why would you have fractional pulses per time slice?

With fractional pulses, the aliasing effects of the Bresenham algorithm can be avoided, or at least considerably reduced. Aliasing shows up especially when the step frequency is near 50 to 200% of the vector update rate (~1kHz).

The Bresenham algorithm is precise within one step with regard to position. But it is very imprecise regarding velocity! The generated signal for the slower (interpolated) axis contains a lot of jitter. Let's draw a line from X0 Y0 to X100 Y75. Bresenham generates one X-step per cycle and an average of 3/4 Y-steps per cycle. This means the Y-steps follow a pattern of step, step, pause, step and so on. The time between two steps varies by a factor of two! The motor has to accelerate and decelerate all the time even while running at constant (average) speed. This is very bad because the stepper makes much more noise and vibration than necessary. Since the resulting acceleration peaks have a broad frequency band and can hit any resonance frequency of the mechanical system, they can even make the motor stall in extreme cases.

I found something interesting that might help. Microkinetics has a free version of their Mill Master Pro and Turn Master Pro software that generates signals for their MN400 USB controller. They list the communication codes here: http://www.microkinetics.com/mn400.htm. How about writing an MN400 simulator for the Propeller that utilizes its counters to generate the ultra-smooth step/dir signals? It appears that the software handles the G-code and conversion into step pulses and sends this down to the controller. The controller only has to implement 30 commands for full functionality.

Their software is really nice and their MN400 price isn't too far out of line, but we could improve on the step smoothing with the Propeller. Also, we would have a customizable controller. Besides, it would be another use for our toys (Propeller and CNC)!

I thought about set-and-forget and came up with this: use one counter to generate pulses on one pin as described before, use the other counter to count the pulses generated (NEGEDGE detector), and take action at given presets.

One of the biggest problems I see right now for a set-and-forget mode is the accelerate/decelerate calculation on the Propeller. You could ramp FRQx up to the desired frequency for acceleration. Then have the pulse counter looking for a precalculated ramp-down preset based on the target count minus the counts needed to decelerate (here's the tricky math part). You'd also need to check for a partial ramp-up (move too short to ramp all the way up).
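The "tricky math part" reduces to the standard kinematics result that stopping from step rate f at linear acceleration a takes f²/(2a) steps. A sketch (Python; all names and numbers are illustrative, including the short-move handling):

```python
# Precalculated ramp-down preset for a set-and-forget move: with linear
# acceleration 'accel' (steps/s^2), decelerating from 'f_max' (steps/s)
# to zero takes f_max^2 / (2 * accel) steps.

def ramp_down_preset(total_steps, f_max, accel):
    """Step count at which to begin decelerating; handles short moves."""
    decel_steps = f_max * f_max / (2.0 * accel)
    if 2 * decel_steps >= total_steps:
        # move too short for a full ramp-up: triangular velocity
        # profile, so start ramping down at the midpoint
        return total_steps / 2.0
    return total_steps - decel_steps

# 10000-step move, 20 kHz top rate, 100k steps/s^2 acceleration:
preset = ramp_down_preset(10000, 20000, 100000)   # deceleration: 2000 steps
```

The pulse counter (the NEGEDGE detector above) would simply compare its count against this preset to trigger the ramp-down.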

I wish I had time to work on it right now, but I'm in the process of converting my G0516 to CNC. Maybe later I'll dig into it.

Good idea. I took a short look at the MN400, both hardware and software. I think this thing is far more complex than necessary. Backlash? Relative mode? Bah, not needed at the hardware level. All of those can be handled internally by the PC software without the external controller even knowing. And look at the board. Have you seen all those ICs and connectors? The Propeller needs less than half the PCB space, and half of the components (see picture) are for protection and debugging.

At the moment I'm implementing my own communication protocol and set of commands (BTW, only 12, and very simple). I'll try to convince the developers of NC-FRS (free CNC milling software) and WinPCNC (low-cost CNC software) to adapt their software to support the Propeller-based controller. In case I have no luck, someone could still write a driver for the open source EMC² or LinuxCNC software.

That's strange. I thought I had uploaded something to OBEX, but I also can't find it anymore. It might have got lost when the forum software was updated and the OBEX database was re-organised some years ago.

Yes, I somehow finished that project. I've already sold ~1000 units of the controller shown above. See also the last post here: http://forums.parallax.com/discussion/98234/prop-driving-a-stepper-motor/p2
But the current code has grown far too complex for OBEX; I fear no one would understand it (including me, sometimes...). I'll have to find the originally posted code in my backups.