Paraprogramming Dispatches


Vibecoding a Cyberpunk 3D GUI for System Commissioning


AI
Eugene Zaikonnikov

Frontend development is terrifying. I know a number of superb frontend developers and admire their work, but the ecosystem, the churn, and the focus on human end-user experience are all positively intimidating. So ideally, for an occasional UI project, we would just hire consultants. Sometimes, though, the stars aligned so that I was pressed into doing Web UIs, but those were all minor specialty projects.

Our latest product is Evacsound. I am rather proud of the programming I’ve done on it, but there wasn’t much to be done in terms of UI. Such products customarily integrate into the customer’s existing SCADA (Supervisory Control and Data Acquisition) systems. These are usually built using specialist solutions with decades of history. They are informative and robust but not necessarily very ergonomic or visually exciting. Either way, they are very much a Not Our Problem area here at Norphonic.

However, no product is truly complete until it’s free from developer intervention across its entire life cycle. For Evacsound the remaining sore spot was project commissioning. Tunnels have different layouts and even topology: two of our latest deliveries have six underground roundabouts between them. Technical rooms, escape tunnels and ramps are also common. So you can’t really provide these layouts for customer projects out of the box. So far this phase was done by us, using some auxiliary programs and SQL queries. These did not abstract the internal layout representation sufficiently for an outsider to use. What we needed was an end-user commissioning tool that would allow site definition by unaffiliated automation engineers. That would make things simpler (and cheaper!) for everyone involved. A vendor-independent commissioning procedure also greatly simplifies marketing the system internationally.

So you see where it’s going. As the design depended on minute knowledge of Evacsound operating principles, onboarding a consultant would have taken quite a bit of time, and we’d still have had to follow up tightly during development. Hence this became the first production project where I decided to give agents a shot, not least inspired by this banger by tptacek. My rough estimate for completing this project by myself was over 4 months, including 6-7 weeks for learning Javascript, three.js and up-to-date frontend practices. Using an agent I completed it in 5 weeks’ worth of intermittent dev time between other projects. This amounted to 270 commits, of which maybe 20 were my manual interventions. The agent was OpenAI Codex, used both via the Web UI/GitHub integration and in the console. I believe an experienced front-end developer could have completed this, AI-aided, somewhat sooner. However, a substantial part of the work was exploring the conceptual UX space and adapting the backend (also by me, manually) for the best fit: this can’t be readily optimized away.

The resulting code, as far as I can judge, is somewhat mediocre but not terrible. There is a panel of tests for most of the functionality and a number of generated documents.

As this was a greenfield internal project, there was no customer with strong aesthetic preferences to contend with. Tunnel designs are by nature very spatial, so I leaned into a late-century naïve three-dimensional wireframe look. The absence of hidden-line elimination and texturing here is both an aesthetic choice and a visual aid to make sense of scene contents. For the color palette, it was settled on bold green for structure and controls, red for selections and yellow for auxiliary elements. Neon blue is reserved for elements under automatic detection and distance calibration, but this flow is not represented in this demo.

Our basic building blocks are the following connexion elements:

  • Nodes. These correspond to installed Evacsound devices and are depicted as pyramids on tunnel profile sections. Their properties are a unique ID (the last three MAC octets) and the distance to the next node.
  • Zone markers. These partition the tunnel into subdivisions that can be orchestrated by the system separately. For instance the operator can run a public announce only in select zones.
  • Joints. They connect the major extent with a sub-extent on either side. Their egress property specifies whether it’s an evacuation exit, with the different function of the first node there represented by a rectangular segment. The V chevron on a joint points in the proper left-to-right direction of the extent.

There are also visual aids:

  • Spans, the line along an extent of nodes labeled with distances.
  • Spacers, seen as zigzags between connexions or chevrons on the sides of joints. Selecting these allows you to add connexion elements via transient buttons. Their other function is to represent the contraction of tunnel length: plotting the tunnel to scale would have made for extremely long walks in a very sparse model. Spacers expand and contract in relation to the distance between the nodes.

You can try it in the demo below, or as a stand-alone version here. Use the mouse to orbit and select elements. The zoom wheel or a gesture advances the camera. Elements can be added on zigzag spacers using the buttons in the upper row, or by pressing Enter if you want a node. They can be deleted with Del or Backspace. Naturally it is not connected to any backend and all communication logic is omitted. You can also fly to the node you want by entering a part of its UID in the search field at the top.

So this was an interesting experience. There’s some debate about how true such metrics are, but I honestly estimate the AI allowed me to move much faster than I could have otherwise. What that spells for our trade, job prospects, the junior pipeline and the enjoyment of our craft is perhaps best left for a separate post. For now let me summarize my practical AI process experience thus far into Eugene’s Eightfold Path:

  1. Use an agent. You’re not going to build anything substantial by copying and pasting code to and from the chat window. Before you say ‘oh, but it works, it’s the AI that can’t handle the caliber of my work’: no, just shut up and try a coding agent. There are offerings by all the big players to get you started easily. Any of them will be better than copy-pasting simply because they provide the embodiment for code within the system/toolchain, which helps eliminate hallucination issues.

  2. Have some programming experience. Contrary to many enthusiastic reports, you totally need it. Perhaps that will change one day, but at the moment I can’t see how. To take advantage of LLMs in a problem domain you need to have substantial expertise in the domain; that’s just how it works. You should know what is simple and what is hard to do: this gives you a better chance of formulating requests in terms of what’s possible. You want to recognize when the agent has misunderstood you or made a poor design choice. You need some sense of taste in code, as ugly solutions give the model a gradient descent into an intractable, unfixable mess down the road.

  3. It helps if you’re a half-decent writer. Frankly, a double major in English (or any other language) lit & CS could be the best skill set for leveraging this technology. An LLM appreciates focused description, metaphors, relevant detail and precise use of adjectives. All this helps to map your wants into its thousands-dimensional conceptual space. Anxious about that great novel you never finished or even started? Well, here’s the place to flex your literary muscle.

  4. Perform one simple request at a time. Do not combine even related tasks; don’t submit multi-step stacks of work in one prompt. On each of your requests the LLM essentially runs an optimization job, and multi-parameter optimizations are hard to do well. Sequence everything into separate requests. Chew each one down with the best description you can come up with.

  5. Steer the agent with strategic design choices. This is directly related to the second point. If you know that a particular internal data structure or algorithm would be a perfect match for the problem, start with that. By the nature of iterative development, the agent will come up with the simplest structure that solves the immediate request. Long term, that structure becomes outdated, and what the agent is likely to do is put more grafts and patches atop it. Since you presumably have a further horizon, insist on sensible design choices from the start. You should help solidify the foundation so that the LLM has an easier time putting meat on the bones.

  6. Separate functionality into different files, each a few hundred lines long. How long is too long will change as available compute grows, but that’s the idea. It simplifies reasoning for the model by reducing the context.

  7. Add basic documentation and make the agent follow it and update it as necessary. I went for a bird’s-eye-view design document plus specific documents for each subsystem. This has the two-fold benefit of helping keep the LLM grounded and of providing readable summary diffs for you after each iteration.

  8. Use the technology you understand, for incomprehensible things will hamper your participation. In my case that meant eschewing fancy frameworks for the plainest JS possible. Also use the technology that has the largest training corpus for LLMs. This really means Javascript, Go or Python and their most popular libraries. Stuck with a C++03 codebase? Well, sucks to be you: you were warned it was a terrible choice two decades ago anyway. We can only hope the technology will catch up for other languages, but it’s clear the corps won’t be getting a firehose of fresh StackOverflow data anymore.

The Marketing Megabyte


Rant
Eugene Zaikonnikov

“The unified memory architecture of M3 Ultra […] can be configured up to 512GB, or over half a terabyte.” — Apple Inc.

It was the early 1990s. Hard drive manufacturers fought hard to come up with ever larger devices, pushing toward the coveted gigabyte boundary. At some point they got enough platter density in consumer-grade devices to hold over a billion bytes. There was one problem: the industry convention was that kilo-, mega-, and gigabytes were all power-of-two units. That left them some 74 million bytes short of the goal.

Not that it stopped them from advertising their disks as gigabyte devices. Thus the Marketing Megabyte was born.

Engineers were understandably pissed, and consumers eventually started to notice that storing a gigabyte of data, as reported by their operating systems, on these disks was simply not possible. This culminated in a tangible threat of class action lawsuits. In response, the marketers, instead of fixing their units, made an effort to redefine the units themselves. This was successfully pushed through the International Electrotechnical Commission in 1999. However, the threat of lawsuits lingered until 2007, when this nonsense was finally adopted by national standards bodies. The Marketing Megabyte became the Government Megabyte, and engineers were told to walk off to substitute units instead.

So let me sum it up here for the next time this issue inevitably flares on the usual tech aggregator websites:

  • The raison d’être for the change was the compulsion to lie to the customers without liability.

  • The Greek-derived prefixes kilo- and mega- refer to exactly a thousand and a million respectively. How important that was after decades of contrary established use in the industry is, however, unclear. To a non-technical user it makes no difference as long as the unit is used consistently. To a computer engineer, Marketing Megabytes are largely unusable. Either way, the proposed substitute prefixes kibi-, mebi- and so on, in a culmination of circular reasoning, still use the same Greek prefixes.

  • Binary computers address memory in powers of two. This makes the power-of-two KB/MB/GB a natural measurement unit. To this day it’s impossible to buy a memory chip with a capacity in a round number of Marketing Megabytes, because it simply doesn’t make sense.

  • The Marketing Megabyte, while standardized via the IEC, is still not an SI unit. Which makes sense, since the byte is not derivable from fundamental units, hence the International Bureau of Weights and Measures has no business managing it. Marketing Megabyte proponents often try to muddy this point by insisting kilo- is an SI prefix (it’s Greek really, see above), but the kilobyte is still not a standard SI unit.

  • The kibi-, mebi-, and gibi- prefixes sound patently silly. I have lost count of the engineers who either choose to ignore them or can’t say them aloud without an embarrassed smile. They could make great cartoon character names though.
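The arithmetic behind the grievance is easy to check. A quick sketch in Common Lisp (the binary-gb/decimal-gb helpers are mine, purely for illustration):

```lisp
;; A "binary" gigabyte is 2^30 bytes; a "marketing" gigabyte is 10^9.
(defun binary-gb (n) (* n (expt 2 30)))
(defun decimal-gb (n) (* n (expt 10 9)))

;; The early-1990s "gigabyte" drive: a billion bytes falls short
;; of 2^30 by roughly 74 million bytes.
(- (binary-gb 1) (decimal-gb 1))   ; => 73741824

;; Apple's 512GB of RAM really is 512 * 2^30 bytes, which is
;; "over half a terabyte" only when counted in decimal units.
(binary-gb 512)                    ; => 549755813888
```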

tl;dr beating a dead horse

Breaking Kernighan's Law


Lisp
Eugene Zaikonnikov

“Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it.” — Brian W. Kernighan

I’m a sucker for sage advice as much as anyone else, and Kernighan is certainly right on the money in the epigraph. Alas, there comes a time in a programmer’s career when you just end up there despite the warning. It could be that you were indeed too clever for your own good, or maybe the code isn’t quite yours anymore after each of your colleagues’ takes on it over the years. Or, just sometimes, the problem is indeed so hard that it strains your capacity as a coder.

It usually starts with a reasonable idea made into first-iteration code. The solution looks fundamentally sound, but then, as you explore the problem space further, it begins to seep nuance, either as a manifestation of some real-world complexity or of your lack of foresight. When I run into this, my first instinct is to instrument the code. If the problem is formidable you’ve got to respect it: flailing around blindly modifying things or, ugh, doing a rewrite at this stage is almost guaranteed to be a waste of time. It helps to find a promising spot, chisel at it, gain a foothold in the problem, and repeat until you crack it. Comfortable debugging tools here can really help to erode the original Kernighan coefficient from 2 down to maybe 1.6 or 1.4, where you still have a chance.

Lisp users are fortunate with the options for interactive debugging, and one facility I often reach for is the plain BREAK. It’s easy enough to wrap it in a conditional for the particular matches you want to debug. However, sometimes you want it to trigger after a particular sequence of events across different positions in the code has taken place. While still doable, this quickly becomes cumbersome, and the state machine starts to occupy too much mental space, which is already scarce. So one day, partly as a displacement activity from being intimidated by a Really Hard Problem, I wrote down my debugging patterns as a handful of macros.

Enter BRAKE. Its features reflect my personal preferences, so they are not necessarily your cup of tea, but it could be a starting point to explore in this direction. Things it can do:

  • act as a simple BREAK with no arguments (duh)
  • wrap an s-expression, passing through its values upon continuing
  • trigger sequentially based on the specified position for a common tag
  • allow for marks that don’t trigger the break but mark the position as reached
  • provide conditional versions for the expressions above
  • print traces of tagged breakpoints/marks

If you compile functions with debug on, you should hopefully be able to see the wrapped sexpr’s result values.

(use-package '(brake))

(defun fizzbuzz ()
  (loop for n from 100 downto 0
	for fizz = (zerop (mod n 3))
	for buzz = (zerop (mod n 5)) do
	(format t "~a "
		(if (not (or fizz buzz))
		    (format nil "~d" n)
		  (brake-when (= n 0)
			      (concatenate 'string
					   (if fizz "Fizz" "")
					   (if buzz "Buzz" "")))))))

These macros try to detect the common cases of tagged sequences being either aborted via break or completed to the last step, resetting them afterwards to the initial state. However, it is possible for a sequence to end up “abandoned”, which can be cleaned up with a manual command.

Say in the example below we want to break when the first two branches were triggered in a specific order. The sequence of 1, 3, 4 will reinitialize once state 4 is reached, allowing it to trigger continuously. At the same time, if we blow our stack it should reset to the initial state when aborting.

(defun ack (m n)
  (cond ((zerop m) (mark :ack 3 (1+ n)))
        ((zerop n) (mark :ack 1 (ack (1- m) 1)))
        (t (brake :ack 4 (ack (1- m) (ack m (1- n)))))))

In addition there are a few utility functions to report on the state of brakepoints, enable or disable brakes based on tags, and turn tracing on or off. Tracing isn’t meant to replace the semantics of TRACE but to provide a souped-up version of the debugging-by-print-statements everyone loves.

CL-USER> (report-brakes)
Tag :M is DISABLED, traced, with 3 defined steps, current state is initial
Tag :F is DISABLED with 2 defined steps, current state is 0
Tag :ACK is ENABLED with 3 defined steps, current state is initial

Disabling breakpoints without recompilation is really handy and something I find myself using all the time. The ability to wrap a sexpr was often sorely missed when using BREAK in constructs without an implicit body.

Sequencing across threads is sketchy as the code isn’t guarded, but in many cases it can work, and its appeal in debugging races is clear. One of these days I hope to make it more robust while avoiding potential deadlocks, but it isn’t there yet. Where it already shines though is in debugging complex iterations, mutually recursive functions and state machines.

The World's Loudest Lisp Program to the Rescue


Lisp
Eugene Zaikonnikov

It is interesting that while I think of myself as a generalist developer, the vast portion of my career has leaned towards embedded and systems programming. I’m firmly a Common Lisp guy at heart, but the embedded tech landscape is the entrenched realm of C, sprinkled with some C++ and nowadays Rust. However, I’ve had the incredible fortune to work for the last few years on a substantial embedded system project in Common Lisp.

The story starts in Western Norway, the world capital of tunnels with over 650 located in the area. Tunnels are equipped and maintained to a high standard and accidents are infrequent, but by the nature of the quantities involved, serious ones do happen. The worst of these are naturally fires, which are notoriously dangerous. Consider that many of the single-bore tunnels are over 5 km long (and up to 24 km). Some of them are undersea tunnels in the fjords with inclinations of up to 10 degrees. There are no automatic firefighting facilities. These are costly both in installation and maintenance, and while they might work in a country with one or two tunnels total, they simply do not scale up. Hence the policy follows the self-evacuation principle: you’re on your own to help yourself and others egress, hopefully managing to follow the signage and lights before the smoke sets in, and pray the extractor fans do their job.

Aftermath of a fire

So far Norway has been spared mass-casualty tunnel fires, but there have been multiple close calls. One particularly unlucky tunnel, the 11.5 km long Gudvangatunnelen, experienced three major fires in the span of a few years. Thus the national Road Administration put forth a challenge to develop a system to augment self-assisted evacuation. Norphonic, my employer, won in a competition of nine contenders on the merits of our pre-existing R&D work. In late 2019 the project officially started, and despite the setbacks of the pandemic it concluded in 2021 with series production of the system now known as Evacsound. The whole development on this project was done by a lean team of:

  • a software engineer who could also do some mechanical design and basic electronics
  • an electrical engineer who could also code
  • two project engineers, dealing with product feasibility w.r.t. regulations and practices, taking care of SCADA integration and the countless practicalities of automation systems for tunnels
  • a project coordinator who communicated the changes and requirements and arranged tests with the Road Administration and our subcontractors
  • a logistics specialist ensuring the flow of scores of shipments back and forth at the peak of the pandemic

Live hacking: Wesley, our EE, patching up a prototype

On top of this we also hired some brilliant MEs and EEs as contractors. In addition, two of Norway’s leading research institutes handled the science of validating the psychoacoustics and simulating fire detection.

At this point the system is already installed, or is being installed, in 6 tunnels in Norway, with another 8 tunnels at some 29 km total on order. We certainly do need to step up our international marketing efforts though.

In the tunnels

The Concept

How do you approach a problem like this? The only thing that can be improved under self-evacuation is the flow of information towards the people in an emergency. This leaves us with eyesight and hearing to work with. Visual aids are far more flexible and easy to control. However, their huge drawback is that their usefulness expires quickly once the smoke sets in.

Sound is more persistent, although there are numerous challenges to using it in the tunnels:

  • The background noise from smoke extraction fans can be very high, and if you go for speech, the threshold for intelligibility has to be at least 10 dB over the noise floor.
  • Public announcement messages alone are not very effective. They are great in the early phase of a fire to give a heads-up to evacuate, but kind of useless once visibility is limited. At that point you also already know you are in trouble.
  • Speech announcements rely on comprehension of the language. In one of the Gudvangatunnelen fires, a bus full of foreign tourists who spoke neither English nor Norwegian was caught in the thick of it. Fortunately a local lorry driver stopped by to collect them.
  • The acoustic environment in tunnels ranges from poor to terrible. An echo of 4-5 seconds in mid-range frequencies is rather typical.

In addition to the above, the system still had to provide visual clues and allow for distributed temperature sensing for fire detection. It also has to withstand pressure washing along the tunnel wall, necessitating IP69 approval. On a tangent, IPx9 means a 100 bar, 80°C water jet at 18 cm distance for 3 minutes, so Evacsound is one of the most water-protected speaker systems in the world.

We decided to start our design from the psychoacoustics end and see where it fell for the rest. The primary idea was to evacuate people by aiding them with directional sound signals that propagate towards the exits. The mechanism was worked out together with the SINTEF research institute, who conducted live trials on the general population. The method was found effective, with over 90% of test participants finding the way out based on directional sound aids alone. A combination of sound effect distance requirements and technical restrictions in the tunnel led us to devices installed at 3 m height along the wall at 25 m intervals. Which was just as well, since it allowed both for the application of acoustic energy in the least wasteful, low-reverberation manner and provided sensible intervals for radiated heat detection.

Node dissected

A typical installation is a few dozen to several hundred nodes in a single tunnel. Which brings us to the headline: we have projects that easily amount to tens of kilowatts of acoustic power in operation, all orchestrated by Lisp code.

Tech Stack

The hardware took nearly 20 design iterations until we reached what I would immodestly call the Platonic design for the problem. We were fortunate to have both mechanical and electronic design expertise from our other products. That allowed us to iterate at an incredible pace. Our software stack settled on Yocto Linux and Common Lisp. Why CL? That’s what I started our earliest design studies with. Deadlines were tight, requirements were fluid, the team was small, and I can move in Common Lisp really, really fast. I like to think that I am also a competent C programmer, but it was clear doing it in C would be many times the effort. And with native compilation there’s no performance handicap to speak of, so it is hard to justify a rewrite later.

Design iterations

Our primary CL implementation is LispWorks. There are some practical reasons for that.

  • Its tree shaker is really good. This allows our binaries to run on a system with 128 MB of RAM with room to spare, which at the scale of thousands of devices manufactured helps keep the costs down.
  • It officially supports ARM32 with POSIX threads, something only it and CCL did at the time.
  • The garbage collector is very tunable.
  • There is commercial support available, with the implementors within earshot. Not that we ended up using it much, but the thought is soothing.

We do, however, use CCL liberally in development, and we employ SBCL/x86 in the test matrix. Testing across the three implementations has found a few quirks on occasion.

System Design

At its heart Evacsound is a soft real-time, distributed system where a central stages time-synchronized operation across hundreds of nodes. Its problem domain and operational circumstances add some constraints:

  1. The system shares comms infrastructure with other industrial equipment, even if on its own VLAN. The network virtualization abstraction breaks down in real-time operation: the product has to tolerate load spikes and service degradation caused by other equipment, yet be mindful of the network traffic it generates.
  2. The operations are completely unmanned. There are no SREs; nobody’s on pager duty for the system. After commissioning there’s typically no network access to the site for vendors anyway. The thing has to sit there on its own and quietly do its job for the next couple of decades until the scheduled tunnel renovation.
  3. We have experience designing no-nonsense hardware that lasts: this is how we have repeat business with Siemens, GE and other big players. But at the sheer scale of installation you can count on devices going dark over the years. There will be hardware faults, accidents and possible battle attrition from fires. Evacsound has to remain operational despite the damage, allow for redundant centrals and ensure zero-configuration maintenance/replacement of the nodes.

The first point channeled us into using pre-uploaded audio rather than live streaming. This uses the network much more efficiently and helps eliminate most synchronization issues. Remember that sound has to be timed accounting for propagation distances between the nodes, and 10 milliseconds of jitter gives you over 3 meters of deviation. This may sound acceptable, but a STIPA measurement will have no mercy. Then, the command and control structure should be flexible enough for executing elaborate plans involving sound and lighting effects, yet tolerate the inevitable misfortunes of real life.
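As a sanity check on that jitter figure, assuming the usual ~343 m/s speed of sound in air:

```lisp
;; Spatial error introduced by clock jitter between nodes:
;; jitter (seconds) times the speed of sound in air (~343 m/s at ~20C).
(defun jitter-deviation (jitter-seconds &optional (speed-of-sound 343.0))
  (* jitter-seconds speed-of-sound))

(jitter-deviation 0.010)   ; => ~3.43 meters for 10 ms of jitter
```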

The system makes heavy use of CLOS, with a smattering of macros in places where they make a difference. Naturally there are a lot of moving parts in the product. We’re not going into the details of the SCADA interfacing, power and resource scheduling, fire detection, self-calibration and node replacement subsystems. The system also has a distinct PA mode and two-way speech communication using a node as a giant speakerphone: these two also add a bit of complexity. Instead we’re going to have an overview of the bits that make reliable distributed operation possible.

Test of fire detection

Processes

The first step in establishing a reliability baseline was to come up with an abstraction for isolated tasks, to be used both on the central and on the nodes. We built it on top of a thread pool, layering on top of it an execution abstraction with start, stop and fault handlers. These tie in to a watchdog monitor process with straightforward decision logic. An Evacsound entity runs a service registry where a service instance looks along these lines:

(register-service site
		  (make-instance 'avc-process :service-tag :avc
				 :closure 'avc-execution
				 :suspend-action 'avc-suspend
				 :resume-action 'avc-resume
				 :process-name "Automatic Volume Control"))

…and methods that can spin up, quit, pause or resume the process based on its service-tag. This helps us ensure that we don’t ever end up with a backtrace or with an essential process quietly knocked out.
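The registry machinery itself isn’t shown in the post; purely as a hypothetical sketch of the idea, tags could map to process descriptions, with lifecycle operations looked up by tag (all names here are illustrative, not Evacsound’s actual API):

```lisp
;; Hypothetical sketch: a service registry keyed by service tag.
(defclass process-description ()
  ((service-tag :initarg :service-tag :reader service-tag)
   (closure     :initarg :closure     :reader process-closure)
   (state       :initform :stopped    :accessor process-state)))

(defvar *services* (make-hash-table :test #'eq))

(defun register-service (process)
  (setf (gethash (service-tag process) *services*) process))

(defun spin-service (tag)
  "Mark the tagged service as running; a real version would submit
the closure to the thread pool and hook up the fault handlers."
  (let ((p (gethash tag *services*)))
    (when p (setf (process-state p) :running))))

(register-service (make-instance 'process-description
                                 :service-tag :avc
                                 :closure 'avc-execution))
(spin-service :avc)
```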

Plans

To perform its function, Evacsound should be able to centrally plan and distributedly execute elaborate tasks. People often argue about what a DSL really is (and whether it really has to have macros), but in our book, if it’s special-purpose, composable and abstracted from implementation details, it is one. Our planner is one example. We can create time-distributed plans in the abstract, we can actualize abstract plans with a specific base time for operations, and we can segment/concatenate/re-normalize plans in various ways. For instance, below is a glimpse of an abstract plan for evacuation generated by the system:

(plan-modulo
 (normalize-plan
  (append (generate-plan (left accident-node)
			 :selector #'select-plain-nodes
			 :time-shift shift
			 :direction :left
			 :orientation :opposite)
 	  (generate-plan (right accident-node)
			 :selector #'select-plain-nodes
			 :time-shift shift
			 :direction :right
			 :orientation :opposite)))
 (* 10 +evacuation-effect-duration+))

We can see above that two plans, one for each evacuation direction, are concatenated and then re-normalized in time. The resulting plan is then modulo-adjusted in time to run in parallel subdivisions of the specified duration.

Generated plans are sets of node ID, effect direction and time delta tuples. They do not yet have commands and absolute times associated with them; that is the job of ACTUALIZE-PLAN.
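The actual entry layout and ACTUALIZE-PLAN internals aren’t shown here; purely as a hypothetical illustration of the idea, a plan entry could be modeled as a structure, and actualization as pairing each entry with a command and an absolute base time:

```lisp
;; Hypothetical sketch: a plan entry as (node-id direction time-delta),
;; and "actualization" as attaching a command and an absolute base time.
(defstruct plan-entry node-id direction time-delta)

(defun actualize (plan command base-time)
  "Turn abstract plan entries into concrete timed commands."
  (mapcar (lambda (entry)
            (list :command command
                  :node (plan-entry-node-id entry)
                  :at (+ base-time (plan-entry-time-delta entry))))
          plan))

(actualize (list (make-plan-entry :node-id "a1b2c3" :direction :left :time-delta 0)
                 (make-plan-entry :node-id "d4e5f6" :direction :left :time-delta 250))
           'play-effect 1000000)
;; => ((:COMMAND PLAY-EFFECT :NODE "a1b2c3" :AT 1000000)
;;     (:COMMAND PLAY-EFFECT :NODE "d4e5f6" :AT 1000250))
```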

Command Language

The central and the nodes communicate in terms of CLOS instances of the classes comprising the command language. In the simplest cases they have just the slots to pass values on, for commands to be executed immediately. However, with an appropriate mixin they can inherit the properties necessary for precision timing control, allowing the commands to be executed in a time-synchronized manner across sets of nodes in plans.

It is established wisdom now that multiple inheritance is an anti-pattern, not worth the headache in the long run. However, Evacsound makes extensive use of it, and over the years it has worked out just fine. I’m not quite sure what the mechanism is that makes it click. Whether it’s because CLOS doesn’t suffer from the diamond problem, or because of the typical treatment of objects with multiple-dispatch methods, or something else, it really is a non-issue and a much better abstraction mechanism than composition.
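As a minimal sketch of the mixin pattern described above (the class and slot names here are hypothetical, not Evacsound’s actual ones): a timing mixin contributes scheduling slots, and a command class inherits them alongside its own payload slots.

```lisp
;; Hypothetical sketch: a timing mixin contributing scheduling slots.
(defclass timed-mixin ()
  ((base-time :initarg :base-time :accessor base-time)
   (delta     :initarg :delta     :accessor delta :initform 0)))

;; A command class mixes in timing alongside its own payload.
(defclass play-media-command (timed-mixin)
  ((media-name :initarg :media-name :accessor media-name)))

(defmethod execution-time ((c timed-mixin))
  (+ (base-time c) (delta c)))

(execution-time (make-instance 'play-media-command
                               :base-time 1000 :delta 250
                               :media-name "prelude"))   ; => 1250
```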

Communication

The next essential task is communication. Depending on the plan, we may communicate with all or a subset of the nodes, in a particular sequence or simultaneously, synchronously or asynchronously, with or without the expectation of reported results. For instance, we may want to get a noise estimation from the microphones for volume control, and that would need to be done for all nodes at once while expecting a result set of reports. A PA message would have to be played synchronized, but the result does not really matter. Or a temperature change notice may arrive unprompted, to be considered by the fire detection algorithm.

This diverse but restricted set of patterns wasn’t particularly well served by existing frameworks and libraries, so we rolled our own on top of a socket library, POSIX threads and condition variables. Our small DSL has two basic constructs: the asynchronous communicate> for outgoing commands and communicate< for expecting the result set, which can be composed into one operation, communicate. A system can generate a distributed command such as

(communicate (actualize-plan
	      (evacuation-prelude-plan s)
	      'fuse-media-file
	      (:base-time (+ (get-nanosecond-time) #.(2ns 1.8)))
	      :sample-rate 32000
	      :media-name "prelude"))

What happens here is that a previously generated plan is actualized with the FUSE-MEDIA-FILE command for every entry. That command inherits several timing properties:

  • absolute BASE-TIME set here explicitly
  • DELTA offset which is set from the plan’s pre-calculated time deltas
  • TIME-TO-COMPLETE (implicit here) which specifies expected command duration and is used to calculate composite timeout value for COMMUNICATE

If any network failure occurs, a reply from a node times out, or a node reports a malfunction, an appropriate condition is signaled. This mechanism allows us to effectively partition distributed networked operation failures into cases conveniently guarded by HANDLER-BIND wrappers. For instance, a macro that just logs the faults and continues the operation can be defined simply as:

(defmacro with-guarded-distributed-operation (&body body)
  `(handler-bind ((distributed-operation-failure
		   #'(lambda (c)
		       (log-info "Distributed operation issue with condition ~a on ~d node~:p"
				 (condition-name c) (failure-count c))
		       (invoke-restart 'communicate-recover)))
		  (edge-offline
		   #'(lambda (c)
		       (log-info "Failed to command node ~a" (uid c))
		       (invoke-restart 'communicate-send-recover))))
     ,@body))

This wrapper would guard both send and receive communication errors, using the restarts to proceed once the event is logged.

So the bird’s-eye view is:

  • we generate the plans using comprehensible, composable, pragmatic constructs
  • we communicate in terms of objects naturally mapped from the problem domain
  • the communication is abstracted away into pseudo-transactional sets of distributed operations with error handling

Altogether it combines into a robust distributed system that is able to thrive in the wilds of the industrial automation jungle.

TL;DR Helping people escape tunnel fires with Lisp and funny sounds
