How Did SimCity Work?

I'm working on a city building web app right now, and I wanted to see how the original SimCity¹ dealt with some of the systems theory problems that are inherent to the genre. Modern city builders run on conceptually simpler but much more computationally expensive systems. Cities Skylines, the hegemonic city builder of the last decade, simulates each citizen as an individual agent with various wants and needs. The simulation of each citizen is not particularly complex, and plenty of optimizations and shortcuts are used to make this sustainable, but it remains an incredibly expensive simulation. Cities Skylines is one of the very few games left today which actually require decent hardware to run. For a cost-conscious developer hoping to run this simulation on a cloud platform, this is a problem.

The trend of individualizing citizens started with SimCity 2000, which ran off of a much more simplified model. Citizens were not persistent, and were not representative of the actual population. Instead, they acted as a proxy for the population: would the citizens who live in this area have job opportunities nearby? Are their needs for beauty and entertainment met? Are they suffering from pollution or crime? It served as a first step from the raw mathematical modeling of SimCity to the pachinko machine modeling we see today. Though it was a far cry from the obscene complexity of Cities Skylines, the design philosophy is identical: the easiest and most realistic way to simulate a city is to simulate the people who live in it. It is the Austrian economics of systems theory.

Even so, we can imagine a simulation which works in the opposite direction: we could derive the behavior of citizens from the social and economic conditions of the city. This is how most simulations -- and indeed most games -- choose to structure their systems. A 3D game engine is a simulation of the physical world which uses vector math to calculate what you would see if you were standing in the game world. You could, of course, calculate the position and orientation of every object in the game world, cast a series of rays from each light source, and render the scene through each pixel which reaches the player's eyes. This would be an incredibly accurate simulation of the physical world, but would be so computationally expensive that it would be impossible to run on anything but the most powerful computers. Instead, we use a series of approximations and shortcuts to make the simulation run on a wide variety of hardware. These approximations and shortcuts are not even much of a compromise -- each step towards "realism" (ray tracing, ambient occlusion, etc) in new game engines sees diminishing returns. It's no contest: the mathematical models used to simulate 3D space are pretty much always more useful than an "Austrian" simulation. Of course, these mathematical models are incredibly complex and the result of decades of innovation. Let's imagine a world where computers are a million times more powerful than they are today. Let's also imagine that, in this world, nobody has ever thought to render 3D graphics before, and you are the first person to ever try. In this world, you would probably start by creating the "Austrian" simulation: every object in the world, every light source, every pixel. You would find some good optimizations (only cast rays from nearby light sources, lower resolution for distant objects, etc) and you would probably get an excellent simulation without requiring the decades of innovation that went into modern engines. 
Sure, the models are still theoretically "better", but the thing we have now is more than "good enough" and is less likely to suffer from bugs and inaccurate system behavior.

Let's flip this scenario: what if we lived in a world where computers didn't progress past 1989? Would Cities Skylines still use the Austrian simulation? Almost certainly not. You might be able to strike a balance between the two approaches, as with SimCity 2000, but the easy, pachinko approach to grotesque realism would be impossible. In this world, we would instead have to rely purely on systems theory. We would have to derive models capable of predicting the broad economic and social trends based on the inputs of the simulation. The preferred approach to every problem would be more equational than procedural.

People took this sort of thing very seriously back in the day. The original SimCity, released in 1989, was not looked at with the skepticism that we might have for something like Cities Skylines today. For one, it was a weird game for its time. Video games were about fighting bad guys or racing cars or whatever. They had win states, loss states, and points. SimCity had no win state or loss state. There wasn't even an identifiable character. It looked more like business software than a game. There were pie charts!

In 1990, a local newspaper organized a SimCity competition with all of the Providence, RI mayoral candidates. This could never happen today. There could never be widespread conflation of success in Cities Skylines with success in city planning. But this was a different time. Fresh off of the high of the 1980s, having boomed straight through two recessions, with neoconservative economics holding complete political and academic hegemony, entering the 3rd consecutive term of a Republican president, with the Soviet Union sputtering to a close and Milton Friedman doing consulting work in China, the world was different. The idea of a video game that could simulate the economic complexity of a city was not that far-fetched. Academics believed that economics was close to being "solved", that we had the tools to roughly predict the future. When I think about the Fukuyamaist "End of History" argument, this mayoral competition is always my first thought. We felt so confident in our ability to predict the future that we gave a video game enough reverence to use in schools and universities. Academic articles were written about educational applications and political biases in the game. It got a piece in the New York Times. More than anything else, all the mayoral candidates in a local election felt that an esports competition was a reasonable stand-in for a debate.

The candidate who won the competition went on to win the election. Is it overreaching to say that he won because of this competition? Maybe, who cares.

Did you know that SimCity is open source now? Crazy!

The license was relinquished 15 years ago in an apparent collaboration with something called the One Laptop Per Child program. The source code is disorganized and filled with quirks. The simulation itself was written in C, but the open sourced code is wrapped with Tcl/Tk. SimCity was designed so that the logic of the game was always separated from the interface, which allowed it to be ported to, as one British journal put it, "every known computer format"². They also included a Python-wrapped C++ version of the simulation, but from what I can gather this was a purely academic exercise. I can't find any evidence that it was ever released in a non-educational context.

In the interest of staying within GCP's free tier, I'm going to see if I can figure out how Will Wright designed a city simulation that could run on a Commodore 64. Since the simulation code is clearly delineated from the interface, open source, and C, I see no reason not to just dive in. A lot has been written about the design of SimCity -- and I'll make use of this throughout the article -- but I had questions about specific systems (the traffic system more than anything else) that are underdocumented elsewhere.

There are plenty of initialization steps, but we may as well skip ahead to the SimFrame function. Many functions call it, and it is one of the few simulation functions that can be called from the wrapper. It seems to simulate each tick of simulation within the engine.

    /* comefrom: doEditWindow scoreDoer doMapInFront graphDoer doNilEvent */
    SimFrame(void)
    {
        short i;

        if (SimSpeed == 0)
            return;

        if (++Spdcycle > 1023)
            Spdcycle = 0;

        if (SimSpeed == 1 && Spdcycle % 5)
            return;

        if (SimSpeed == 2 && Spdcycle % 3)
            return;

        if (++Fcycle > 1023) Fcycle = 0;
    /*  if (InitSimLoad) Fcycle = 0; */
        Simulate(Fcycle & 15);
    }

After declaring a variable that is never used (short i), it checks to see if the game is paused (SimSpeed == 0) and returns if so. It then increments Spdcycle, wrapping back to 0 once it passes 1023. The way game speed is handled is clever. If the speed is 1, the simulation only runs every 5th frame. If the speed is 2, it only runs every 3rd frame. If the speed is 3, it skips no frames. Interestingly, this means that the 4 speeds run the simulation on 0%, 20%, 33%, and 100% of frames respectively. Why not a Spdcycle % 2?
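
To check those percentages, here is a minimal Python sketch that replays the gating logic of SimFrame() for 1024 frames at each speed. The function name frames_simulated and the starting value of 0 for the counter are my assumptions; only the comparisons are taken from the C code.

```python
def frames_simulated(sim_speed, total_frames=1024):
    """Count how many of total_frames calls would reach Simulate()."""
    spd_cycle = 0  # assumed starting value; the C code only shows the wrap
    ran = 0
    for _ in range(total_frames):
        if sim_speed == 0:            # paused: return before anything else
            continue
        spd_cycle += 1
        if spd_cycle > 1023:          # same wrap as the C code
            spd_cycle = 0
        if sim_speed == 1 and spd_cycle % 5:
            continue                  # slow: only multiples of 5 get through
        if sim_speed == 2 and spd_cycle % 3:
            continue                  # medium: only multiples of 3 get through
        ran += 1
    return ran


for speed in range(4):
    ran = frames_simulated(speed)
    print(f"speed {speed}: {ran} of 1024 frames ({ran / 1024:.1%})")
```

The counts come out slightly above 20% and 33% because 0 counts as a multiple of both 5 and 3.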

The last part is completely Greek to me. It increments Fcycle – similar to Spdcycle except it only advances on an "active" simulation frame – wrapping it the same way, then runs the simulation with the input variable mod16, which is derived from a bitwise ‘and’ between Fcycle and 15 (1111 in binary). I ran this through a quick Python script to see the result.

    results = {}
    for x in range(1024):
        mod16 = str(x & 15)
        if results.get(mod16):
            results[mod16] += 1
        else:
            results[mod16] = 1
    print(results)

The results showed that in 1024 cycles, mod16 will return each value between 0 and 15 exactly 64 times. That makes sense: x & 15 keeps only the low four bits, which for non-negative numbers is the same as x % 16. Cool alternative to nested loops / multiple iterator variables. Also, converting a bitwise operation to a string felt weird. Like wearing a suit to a track meet.

After this is a beautiful function, Simulate:

    /* comefrom: SimFrame */
    Simulate(int mod16)
    {
        static short SpdPwr[4] = { 1,  2,  4,  5 };
        static short SpdPtl[4] = { 1,  2,  7, 17 };
        static short SpdCri[4] = { 1,  1,  8, 18 };
        static short SpdPop[4] = { 1,  1,  9, 19 };
        static short SpdFir[4] = { 1,  1, 10, 20 };
        short x;

        x = SimSpeed;
        if (x > 3) x = 3;

        switch (mod16)  {
        case 0:
            if (++Scycle > 1023) Scycle = 0;  /* this is cosmic */
            if (DoInitialEval) {
                DoInitialEval = 0;
                CityEvaluation();
            }
            CityTime++;
            AvCityTax += CityTax;   /* post */
            if (!(Scycle & 1)) SetValves();
            ClearCensus();
            break;
        case 1:
            MapScan(0, 1 * WORLD_X / 8);
            break;
        case 2:
            MapScan(1 * WORLD_X / 8, 2 * WORLD_X / 8);
            break;
        case 3:
            MapScan(2 * WORLD_X / 8, 3 * WORLD_X / 8);
            break;
        case 4:
            MapScan(3 * WORLD_X / 8, 4 * WORLD_X / 8);
            break;
        case 5:
            MapScan(4 * WORLD_X / 8, 5 * WORLD_X / 8);
            break;
        case 6:
            MapScan(5 * WORLD_X / 8, 6 * WORLD_X / 8);
            break;
        case 7:
            MapScan(6 * WORLD_X / 8, 7 * WORLD_X / 8);
            break;
        case 8:
            MapScan(7 * WORLD_X / 8, WORLD_X);
            break;
        case 9:
            if (!(CityTime % CENSUSRATE)) TakeCensus();
            if (!(CityTime % (CENSUSRATE * 12))) Take2Census();

            if (!(CityTime % TAXFREQ))  {
                CollectTax();
                CityEvaluation();
            }
            break;
        case 10:
            if (!(Scycle % 5)) DecROGMem();
            DecTrafficMem();
            NewMapFlags[TDMAP] = 1;
            NewMapFlags[RDMAP] = 1;
            NewMapFlags[ALMAP] = 1;
            NewMapFlags[REMAP] = 1;
            NewMapFlags[COMAP] = 1;
            NewMapFlags[INMAP] = 1;
            NewMapFlags[DYMAP] = 1;
            SendMessages();
            break;
        case 11:
            if (!(Scycle % SpdPwr[x])) {
                DoPowerScan();
                NewMapFlags[PRMAP] = 1;
                NewPower = 1; /* post-release change */
            }
            break;
        case 12:
            if (!(Scycle % SpdPtl[x])) PTLScan();
            break;
        case 13:
            if (!(Scycle % SpdCri[x])) CrimeScan();
            break;
        case 14:
            if (!(Scycle % SpdPop[x])) PopDenScan();
            break;
        case 15:
            if (!(Scycle % SpdFir[x])) FireAnalysis();
            DoDisasters();
            break;
        }
    }

This is a lot to go through, but it should hold pretty much everything that matters. I'll move piece by piece.

    Simulate(int mod16)
    {
        static short SpdPwr[4] = { 1,  2,  4,  5 };
        static short SpdPtl[4] = { 1,  2,  7, 17 };
        static short SpdCri[4] = { 1,  1,  8, 18 };
        static short SpdPop[4] = { 1,  1,  9, 19 };
        static short SpdFir[4] = { 1,  1, 10, 20 };
        short x;

        x = SimSpeed;
        if (x > 3) x = 3;

The shortened naming convention makes things a bit harder than they need to be, and they didn’t make it any better in the C++/Python port. In this case, though, the [4] is a size rather than an index: these are five four-element lookup tables, one entry per speed setting. Judging by where they are used further down in the switch, they are SpeedPower, SpeedPTL (pollution, terrain, and land value, gating PTLScan()), SpeedCrime, SpeedPopulationDensity, and SpeedFire. It then sets “x” to a version of SimSpeed which is capped at 3, so it is always a valid index into those tables. I didn’t know it was even possible to have a higher value than 3 but whatever.
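
Reading ahead to the switch, each Spd array is indexed by x and used as a divisor in a check like Scycle % SpdPwr[x]. One way to see what that does is to count how often each scan would fire over a full 1024-value Scycle loop. This is a rough Python sketch; the table values are copied from the C code, while the function name and the dictionary layout are mine.

```python
# Table values copied from Simulate(); one entry per speed setting 0..3.
SPEED_TABLES = {
    "SpdPwr": [1, 2, 4, 5],
    "SpdPtl": [1, 2, 7, 17],
    "SpdCri": [1, 1, 8, 18],
    "SpdPop": [1, 1, 9, 19],
    "SpdFir": [1, 1, 10, 20],
}


def runs_per_1024_scycles(table_name, sim_speed):
    """Count the Scycle values in 0..1023 on which the gated scan would run."""
    x = min(sim_speed, 3)                      # same cap as `if (x > 3) x = 3`
    period = SPEED_TABLES[table_name][x]
    return sum(1 for scycle in range(1024) if scycle % period == 0)


for name in SPEED_TABLES:
    print(name, [runs_per_1024_scycles(name, s) for s in range(4)])
```

The faster the game speed, the larger the divisor, so the expensive scans run on a smaller share of cycles.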

We then get to the “big switch” between 16 options. Case 0 has the magnificent comment “this is cosmic”. I assume this just means that reaching 1023 Scycles takes a while (16368 simulated frames), but I like to imagine this was an outburst of emotion immortalized in code. I’ll submit a pull request appending an exclamation point. It then checks if it needs to do an initial evaluation (this seems like a messier solution to this problem than I’d expect from the billion initializations this game has). It then increments CityTime (the same idea as Scycle, but with a variable starting point and no wraparound) and adds the player’s city tax to the running total behind the average city tax. It then runs SetValves() on every other pass (whenever Scycle is even, so this is deterministic rather than a coin flip), and runs ClearCensus().

The next 8 cycles, half of the whole simulation, are spent on MapScan(). I wonder what that does.

    MapScan(int x1, int x2)
    {
        register short x, y;

        for (x = x1; x < x2; x++)  {
            for (y = 0; y < WORLD_Y; y++) {
                if (CChr = Map[x][y]) {
                    CChr9 = CChr & LOMASK;  /* Mask off status bits  */
                    if (CChr9 >= FLOOD) {
                        SMapX = x;
                        SMapY = y;
                        if (CChr9 < ROADBASE) {
                            if (CChr9 >= FIREBASE) {
                                FirePop++;
                                if (!(Rand16() & 3)) DoFire();  /* 1 in 4 times */
                                continue;
                            }
                            if (CChr9 < RADTILE)  DoFlood();
                            else DoRadTile();
                            continue;
                        }

                        if (NewPower && (CChr & CONDBIT))
                            SetZPower();

                        if ((CChr9 >= ROADBASE) &&
                            (CChr9 < POWERBASE)) {
                            DoRoad();
                            continue;
                        }

                        if (CChr & ZONEBIT) { /* process Zones */
                            DoZone();
                            continue;
                        }

                        if ((CChr9 >= RAILBASE) &&
                            (CChr9 < RESBASE)) {
                            DoRail();
                            continue;
                        }
                        if ((CChr9 >= SOMETINYEXP) &&
                            (CChr9 <= LASTTINYEXP))  /* clear AniRubble */
                            Map[x][y] = RUBBLE + (Rand16() & 3) + BULLBIT;
                    }
                }
            }
        }
    }

These variables are a lot more difficult to read, though I do see a comforting 2D tile array: Map[][]. Let’s first parse through the potential input variables:

    MapScan(0, 1 * WORLD_X / 8);
    MapScan(1 * WORLD_X / 8, 2 * WORLD_X / 8);
    MapScan(2 * WORLD_X / 8, 3 * WORLD_X / 8);
    MapScan(3 * WORLD_X / 8, 4 * WORLD_X / 8);
    MapScan(4 * WORLD_X / 8, 5 * WORLD_X / 8);
    MapScan(5 * WORLD_X / 8, 6 * WORLD_X / 8);
    MapScan(6 * WORLD_X / 8, 7 * WORLD_X / 8);
    MapScan(7 * WORLD_X / 8, WORLD_X);

I can’t find anything useful on WORLD_X, but from this snippet in the ClearMap() function in s_gen.c:

    ClearMap(void)
    {
        register short x, y;

        for (x = 0; x < WORLD_X; x++)
            for (y = 0; y < WORLD_Y; y++)
                Map[x][y] = DIRT;
    }

It definitely looks like it’s a constant representing the width of the map measured in tiles. I happen to know that the SimCity map is 120x100, so let’s replace WORLD_X with 120. Therefore, the scans become:

    MapScan(0, 15);
    MapScan(15, 30);
    MapScan(30, 45);
    MapScan(45, 60);
    MapScan(60, 75);
    MapScan(75, 90);
    MapScan(90, 105);
    MapScan(105, 120);

These values – x1 and x2 – are always at a distance equal to ⅛ of the total map width. Obviously, it is scanning through the map in segments in order to spread processing power and random events throughout. These random events – like “DoFire()” and “DoFlood()” – seem to be mostly negative. DoRail() and DoRoad() correspond to deteriorating roads and rails. DoZone(), on the other hand, seems to lead to an entirely separate and expansive simulation system in s_zone.c.
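
The striping is easy to reproduce. Below is a small Python sketch that derives the eight (x1, x2) pairs with the same integer arithmetic as the switch and confirms that every column gets scanned exactly once per 16-step cycle. WORLD_X = 120 comes from the 120x100 map size mentioned above; the function name is my own.

```python
WORLD_X = 120  # map width in tiles, per the 120x100 figure above


def scan_stripes(world_x=WORLD_X, stripes=8):
    """Return the (x1, x2) column ranges handed to MapScan() in cases 1..8."""
    return [(i * world_x // stripes, (i + 1) * world_x // stripes)
            for i in range(stripes)]


stripes = scan_stripes()
print(stripes)

# Every column from 0 to WORLD_X - 1 is covered exactly once.
covered = [x for x1, x2 in stripes for x in range(x1, x2)]
print(covered == list(range(WORLD_X)))
```

Because the bounds are computed as i * WORLD_X / 8 rather than a fixed stripe width, the last stripe would still end exactly at WORLD_X on a map whose width is not divisible by 8.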

MakeTraf() is the best-commented function in this entire codebase. I embarked upon this project to figure out how Will Wright dealt with traffic, so this is a huge blessing. Here’s the code:

    /* comefrom: DoIndustrial DoCommercial DoResidential */
    MakeTraf(int Zt)
    {
        short xtem, ytem;

        xtem = SMapX;
        ytem = SMapY;
        Zsource = Zt;
        PosStackN = 0;

    #if 0
        if ((!Rand(2)) && FindPTele()) {
    /* printf("Telecommute!\n"); */
            return (TRUE);
        }
    #endif

        if (FindPRoad()) {          /* look for road on zone perimeter */
            if (TryDrive()) {       /* attempt to drive somewhere */
                SetTrafMem();       /* if sucessful, inc trafdensity */
                SMapX = xtem;
                SMapY = ytem;
                return (TRUE);      /* traffic passed */
            }
            SMapX = xtem;
            SMapY = ytem;
            return (FALSE);         /* traffic failed */
        }
        else return (-1);           /* no road found */
    }

Misspelled “successful” – that’s another pull request. FindPRoad() looks like this:

    /* comefrom: DoSPZone MakeTraf */
    FindPRoad(void)   /* look for road on edges of zone   */
    {
        static short PerimX[12] = {-1, 0, 1, 2, 2, 2, 1, 0,-1,-2,-2,-2};
        static short PerimY[12] = {-2,-2,-2,-1, 0, 1, 2, 2, 2, 1, 0,-1};
        register short tx, ty, z;

        for (z = 0; z < 12; z++) {
            tx = SMapX + PerimX[z];
            ty = SMapY + PerimY[z];
            if (TestBounds(tx, ty)) {
                if (RoadTest(Map[tx][ty])) {
                    SMapX = tx;
                    SMapY = ty;
                    return (TRUE);
                }
            }
        }
        return (FALSE);
    }
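
To see what the PerimX and PerimY tables describe, here is a short Python sketch that draws the twelve offsets around a zone. The offsets are copied from the C code. The rendering assumes that y grows downward, as it does on screen, and that (0, 0) is the zone's center tile; both are my assumptions, as is the render_perimeter name. Each perimeter cell is labeled with its search order in hexadecimal (0 through b), and Z marks the 3x3 zone itself.

```python
PERIM_X = [-1, 0, 1, 2, 2, 2, 1, 0, -1, -2, -2, -2]
PERIM_Y = [-2, -2, -2, -1, 0, 1, 2, 2, 2, 1, 0, -1]


def render_perimeter():
    """Draw the 5x5 neighborhood of a zone center, one string per row."""
    order = {offset: i for i, offset in enumerate(zip(PERIM_X, PERIM_Y))}
    rows = []
    for dy in range(-2, 3):          # assumed: y grows downward
        row = ""
        for dx in range(-2, 3):
            if (dx, dy) in order:
                row += format(order[(dx, dy)], "x")   # search order, 0..b
            elif abs(dx) <= 1 and abs(dy) <= 1:
                row += "Z"                            # the zone's own tiles
            else:
                row += "."                            # corners are never checked
        rows.append(row)
    return rows


print("\n".join(render_perimeter()))
```

The four corner tiles are skipped, which fits a search for roads that touch the zone along an edge rather than diagonally.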

Some quick comments: the values for where to find a road perimeter are hardcoded to 3x3 zones! Very interesting. It starts at the left end of one edge (the top edge, if y grows downward the way it does on screen) and works its way around the perimeter from there. Interestingly, this would have real gameplay consequences: you can increase your building’s max searchable distance by only placing roads in the desired direction. TryDrive() looks like this:

    /* comefrom: MakeTraf */
    TryDrive(void)
    {
        short z;

        LDir = 5;
        for (z = 0; z < MAXDIS; z++) {  /* Maximum distance to try */
            if (TryGo(z)) {             /* if it got a road */
                if (DriveDone())        /* if destination is reached */
                    return (TRUE);      /* pass */
            } else {
                if (PosStackN) {        /* deadend , backup */
                    PosStackN--;
                    z += 3;
                }
                else return (FALSE);    /* give up at start  */
            }
        }
        return (FALSE);                 /* gone maxdis */
    }

Once we find any road along the perimeter, we begin a search along the road for up to MAXDIS steps, with each dead end costing three extra steps. If it reaches the type of destination it was looking for, it returns true. SetTrafMem() then increases traffic density along each tile traversed. This answers one half of my question, but not the other: how do we integrate these spatial relationships with the supply and demand model of the game? Let’s look back at DoResidential():

    DoResidential(int ZonePwrFlg)
    {
        short tpop, zscore, locvalve, value, TrfGood;

        ResZPop++;
        if (CChr9 == FREEZ) tpop = DoFreePop();
        else tpop = RZPop(CChr9);

        ResPop += tpop;
        if (tpop > Rand(35)) TrfGood = MakeTraf(0);
        else TrfGood = TRUE;

        if (TrfGood == -1) {
            value = GetCRVal();
            DoResOut(tpop, value);
            return;
        }

        if ((CChr9 == FREEZ) || (!(Rand16() & 7))) {
            locvalve = EvalRes(TrfGood);
            zscore = RValve + locvalve;
            if (!ZonePwrFlg) zscore = -500;

            if ((zscore > -350) &&
                (((short)(zscore - 26380)) > ((short)Rand16Signed()))) {
                if ((!tpop) && (!(Rand16() & 3))) {
                    MakeHosp();
                    return;
                }
                value = GetCRVal();
                DoResIn(tpop, value);
                return;
            }
            if ((zscore < 350) &&
                (((short)(zscore + 26380)) < ((short)Rand16Signed()))) {
                value = GetCRVal();
                DoResOut(tpop, value);
            }
        }
    }

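The function begins by incrementing ResZPop, which despite the name goes up by one per zone, so it is really a count of residential zones. Then it checks if the current tile is FREEZ, which as far as I can tell is the empty residential zone tile rather than anything to do with zoning policy. If it is, the zone's population comes from the DoFreePop() function. Otherwise, it comes from the RZPop() function. Either way, it is added to the citywide ResPop. Then it compares the population to a random number from Rand(35): if the population is larger, it calls MakeTraf(), which finds a road and sets the traffic density along it. Otherwise, traffic is assumed to be fine. If MakeTraf() finds no road at all (-1), the zone shrinks through DoResOut() and the function returns.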

Next, on roughly one scan in eight (or every time for an empty zone), it evaluates the current tile using the EvalRes() function. This function checks the traffic conditions, pollution levels, and other factors to determine a local score for the tile, which is added to the global RValve to get zscore. Unpowered zones are forced to a zscore of -500. This score is then compared to a random number: the higher the score, the more likely the zone is to grow through DoResIn(), and the lower the score, the more likely it is to shrink through DoResOut(). Finally, when a zone with no population is about to grow, there’s a 1 in 4 chance that it becomes a hospital instead. We could keep going building by building, but it would be a lot of the same. I think now that we've seen the general lifecycle of the simulation, we can start accelerating towards the bigger picture.
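
The two zscore comparisons in DoResidential() are easier to reason about as probabilities. Here is a rough Python sketch under one big assumption: that Rand16Signed() is uniform over the signed 16-bit range (-32768 to 32767). I have not verified that against the random number code, so treat the exact numbers as illustrative. The helper names are mine, and the sketch ignores the 1-in-8 gate in front of the whole evaluation.

```python
def to_short(n):
    """Wrap an integer into the signed 16-bit range, like the (short) casts."""
    return (n + 32768) % 65536 - 32768


def grow_probability(zscore):
    """Chance that `(short)(zscore - 26380) > Rand16Signed()` holds."""
    if zscore <= -350:
        return 0.0
    threshold = to_short(zscore - 26380)
    # Number of values r in [-32768, 32767] with r < threshold.
    return (threshold + 32768) / 65536


def shrink_probability(zscore):
    """Chance that `(short)(zscore + 26380) < Rand16Signed()` holds."""
    if zscore >= 350:
        return 0.0
    threshold = to_short(zscore + 26380)
    # Number of values r in [-32768, 32767] with r > threshold.
    return (32767 - threshold) / 65536


for zscore in (-500, -350, 0, 350, 2000):
    print(zscore, round(grow_probability(zscore), 4),
          round(shrink_probability(zscore), 4))
```

Under that assumption a neutral zone (zscore of 0) has a bit under a 10% chance of growing on any given evaluation, and the shrink check is only reached when the growth check fails. Scores in the hundreds or low thousands nudge the odds rather than decide them, which lines up with the slow drift you see in the game.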

Chaim Gingold, a developer who worked with Will Wright on Spore, wrote on the design of SimCity as part of his PhD dissertation. Here’s a link, the relevant section starts on page 297³.

Gingold opens with a high-level representation of the simulation

He also includes a handy visual representation of the 16-step simulation structure. This includes the 8 map scan steps as well as the 8 simulation steps.

The biggest new insights here are at the bottom: a lot of the complexity I had difficulty parsing is cycle checks to delay scans on faster time settings. I don’t plan to have time settings, so I won’t investigate too much further. One thing though – the time settings on TakeCensus() and Take2Census() tell us the frequency of simulation: Every 4 cycles is a month, every 48 is a year.
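
As a quick sanity check on that calendar, a tiny Python sketch that converts a CityTime value into elapsed years and months using the 4-per-month and 48-per-year figures above. The constant and function names are mine, and it deliberately says nothing about which calendar year the game starts in.

```python
CYCLES_PER_MONTH = 4    # one TakeCensus() interval, per the figures above
CYCLES_PER_YEAR = 48    # one Take2Census() interval: 12 months


def elapsed(city_time):
    """Split a CityTime counter into (whole years, whole months) elapsed."""
    years, remainder = divmod(city_time, CYCLES_PER_YEAR)
    return years, remainder // CYCLES_PER_MONTH


print(elapsed(100))  # 100 ticks of CityTime
```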

Further, what I first thought was just a bizarre ID system turned out to be a vital part of the cellular automata system. The definitions of different buildings look like this:

    #define HOSPITAL    409
    #define CHURCH      418
    #define COMBASE     423
    #define COMCLR      427
    #define CZB         436
    #define INDBASE     612
    #define INDCLR      616
    #define LASTIND     620
    #define IND1        621
    #define IZB         625
    #define IND2        641
    #define IND3        644
    #define IND4        649
    #define IND5        650
    #define IND6        676
    #define IND7        677
    #define IND8        686
    #define IND9        689
    #define PORTBASE    693
    #define PORT        698
    #define LASTPORT    708

And this corresponds to handlers like this:

    CChr9 = CChr & LOMASK; /* Mask off status bits  */
    if (CChr9 >= FLOOD) {
        SMapX = x;
        SMapY = y;
        if (CChr9 < ROADBASE) {
            if (CChr9 >= FIREBASE) {
                FirePop++;
                if (!(Rand16() & 3)) DoFire();  /* 1 in 4 times */
                continue;

First, what’s up with CChr9? This was a question which was beating me up for a while. However, I’m proud to say I’ve figured it out (due to someone else figuring it out for me).

Each tile on the SimCity map uses 16 bits.

16 bits! The first 10 are a reference and everything else the game needs to know is in the last 6. My map tiles use 16 bits just for the index. Ridiculous.

The first 10 (the low-order bits) correspond to one of the 956 possible tile sprites, and the last 6 correspond to various boolean variables about the tile. Based on this, “masking off status bits” can be reasonably assumed to mean isolating those 10 index bits. C’s & is a plain bitwise AND, so the mask simply has a 1 wherever a bit should be kept. Since CChr9 is compared directly against small IDs like FLOOD, the index has to sit in the low bits, which makes LOMASK 1111111111 (1023) rather than 1111111111000000 (65472).
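
Here is roughly what that packing looks like in Python, assuming the tile index sits in the low 10 bits (LOMASK = 1023). CONDBIT, ZONEBIT, and BULLBIT are names that appear in the C code above; PWRBIT, BURNBIT, and ANIMBIT, along with every bit position in the table, are my assumptions about the header file and should be checked against the source before relying on them.

```python
LOMASK = 0x03FF  # low 10 bits: tile index (assumed value, 1023)

# Assumed flag layout for the high 6 bits; verify against the C headers.
FLAG_BITS = {
    "PWRBIT": 0x8000,   # assumed: tile has power
    "CONDBIT": 0x4000,  # tile conducts power
    "BURNBIT": 0x2000,  # assumed: tile can burn
    "BULLBIT": 0x1000,  # tile can be bulldozed
    "ANIMBIT": 0x0800,  # assumed: tile is animated
    "ZONEBIT": 0x0400,  # tile is the center of a zone
}


def decode_tile(cell):
    """Split a 16-bit map cell into (tile index, set of flag names)."""
    index = cell & LOMASK
    flags = {name for name, bit in FLAG_BITS.items() if cell & bit}
    return index, flags


# A powered hospital center tile: index 409 with two flags set.
cell = 409 | FLAG_BITS["ZONEBIT"] | FLAG_BITS["PWRBIT"]
print(decode_tile(cell))
```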

That clears that up. So CChr9 is just the spritemap location of each tile. Tile data is organized in such a way that groups of similar behavior are adjacent, so rather than assigning each ID a separate behavior we can just check a range. In the example above, we check if the ID is at least “FLOOD” before giving a chance of catching fire. The ID map around FLOOD looks like this:

Since FLOOD is 48, all tiles below it are fireproof. I would argue that woods and trees should be in the flammable category, but the behavior is clear: anything naturally generated by the world won’t catch on fire, but anything the player places can.

Even crazier than the fact that there are only 16 bits per tile is that the tiles do not point to anything. This was the biggest revelation to me in this whole investigation: the original SimCity did not keep track of what buildings were where. When buildings were demolished, they ran a function which found all the other tiles in the building and deleted them that way. There is a tile at the center of every zone which has ZONEBIT on, so when you need to operate only once per building (like to check power, create traffic, etc) you have a single tile to check. This is a bit inefficient because it means that every time you need to check buildings you have to check every tile in the map, but this could be easily solved by maintaining a list of all ZONEBIT tile coordinates. Maybe this already exists somewhere in the code, but I never found one.
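
The list of ZONEBIT coordinates suggested above could be sketched like this in Python. Nothing here comes from the SimCity source: the ZoneIndex class is hypothetical, the ZONEBIT value of 0x0400 is an assumption, and the map is a plain list of columns indexed as tile_map[x][y] to mirror Map[x][y].

```python
ZONEBIT = 0x0400  # assumed bit position for the zone-center flag


class ZoneIndex:
    """Keeps a set of zone-center coordinates in sync with the tile map."""

    def __init__(self, tile_map):
        self.tile_map = tile_map
        # One full scan up front; after that, only writes touch the set.
        self.centers = {
            (x, y)
            for x, column in enumerate(tile_map)
            for y, cell in enumerate(column)
            if cell & ZONEBIT
        }

    def set_tile(self, x, y, cell):
        """Write a cell and update the index if its ZONEBIT changed."""
        self.tile_map[x][y] = cell
        if cell & ZONEBIT:
            self.centers.add((x, y))
        else:
            self.centers.discard((x, y))


world = [[0] * 100 for _ in range(120)]   # 120x100 map of empty cells
index = ZoneIndex(world)
index.set_tile(10, 20, 244 | ZONEBIT)     # place a zone center
print(sorted(index.centers))
```

The catch is that every write to the map has to go through set_tile(), which is a real cost in a codebase that assigns to Map[x][y] directly all over the place.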

I finally come to the answer to my original question, and it isn’t pretty.

I doubt I can do this justice, but I can try. For a better (and much more granular) explanation, check out the dissertation.

What is a valve? According to Gingold, it regulates and limits the flow of information between different simulation agents. This concept comes from Jay Forrester, an influential early computer engineer and the founder of system dynamics. The concept makes intuitive sense: if every time you added something to your city it immediately reached its economic equilibrium, it wouldn’t feel like much of a simulation at all. A valve is a way to simulate the slow, imperfect way that economic systems try to reach their end-point. You configure the reaction speed to new stimuli. RValve, CValve, and IValve update twice every month.

SetValves() looks at ratios between the population of Residential, Commercial, and Industrial zones and sets the projected velocity, positive or negative, of those types of zones. For example, employment is based on a ratio between commercial+industrial and residential. It also takes into account things like the ratio of land value to pollution etc. It uses this to determine a “projected” population. A velocity towards that projected population is determined and then modified by the tax rate and game difficulty. Finally, a global valve is set to grow or shrink zones.
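
To make the valve idea concrete, here is a deliberately generic Python sketch of a clamped accumulator that drifts toward a projected population. It is not a transcription of SetValves(): the gain, the ratio cap, the limit, and the flat tax penalty are all illustrative numbers I picked, and update_valve is a hypothetical name.

```python
def clamp(value, low, high):
    return max(low, min(high, value))


def update_valve(valve, projected_pop, current_pop,
                 tax_penalty=0, gain=600, max_ratio=2.0, limit=2000):
    """Nudge a valve toward the projected population (illustrative constants).

    The ratio of projected to current population sets the direction and size
    of the push; the clamp keeps any single update, and the valve itself,
    from running away.
    """
    if current_pop > 0:
        ratio = min(projected_pop / current_pop, max_ratio)
    else:
        ratio = 1.0   # assumption: an empty city gets no push either way
    velocity = (ratio - 1.0) * gain - tax_penalty
    return clamp(valve + velocity, -limit, limit)


valve = 0
for month in range(4):
    valve = update_valve(valve, projected_pop=150, current_pop=100)
    print(month, valve)
```

The point of the structure is that demand responds gradually: a big gap between projected and actual population only moves the valve by a bounded step each update.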

Here’s my problem with this: what happened to modularity? The cellular automata structure was so elegant, and now it seems like the RCI structure is moving back in the direction of whole-set calculation. The confidence of the cellular automata philosophy seems broken by this manual adjustment away from the supply and demand structures present within each building. I’m sure this is a result of hardware limitation, but it does seem like a step backwards. The philosophy of a cellular automata model is very promising, particularly for scalability. In my project I hope to make a theoretically infinite map of interoperable cities, and whole set calculation would increase the complexity of the simulation significantly. However, removing whole set calculation from this model is no easy task. It requires a localization of all the economic forces of the city. Of course, this itself is similar to the SimCity 2000 model of simulating individual agents as a proxy for macroeconomic forces. All roads lead back to Austria apparently.

My ideal configuration might look like this:

Every zone gets its valve score by itself, the same way it might contribute to traffic. You have a set of commercial, industrial, and residential agents. Whether they are individual tiles or abstract objects shouldn’t matter too much. [3/30/2023: It ended up mattering a lot]

Residential zones need: Employment from Commercial and Industrial zones (money in) in order to buy Goods from commercial zones (money out)
Commercial zones need: Goods from industrial zones AND employment from Residential zones (money out) in order to sell them to residential zones (money in)
Industrial zones need: Employees from Residential zones (money out) in order to sell goods to commercial zones (money in)
Commercial zones and industrial zones are cyclical, whereas residential zones are linear⁴.

So we naturally want to start with residential zones. We start with individual node searches:

Residential zones search for jobs in their vicinity. The number of residents willing to work is dependent on distance⁵, and the percentage employed is then taken as an input. You end up with 4 distinct variables that must be stored, updated, and made accessible to other buildings:
1. Employment rate is the number of jobs found divided by the population. When this is suitably high, the residential zone will attract more population.
2. Population is the number of people living in the zone. This takes in a number of map-level calculations on top of employment. It also doubles as Employment Capacity and Consumption Capacity.
3. Population capacity is a valve variable determined by zone development. It starts at some arbitrarily low point and increases whenever the zone grows. If the population is significantly below population capacity, the zone shrinks.
4. Consumption is the number of goods purchased. When it is significantly lower than consumption capacity, demand for commercial zones increases. There’s more that could potentially be done here.

Industrial zones also record the employment they receive and take it as the output capacity input. They then search for Commercial zones within some vicinity to determine sales. Shorter distances marginally increase the sale value to the industrial zone. You get 4 variables here too:
1. Employment capacity is a valve variable. It starts at some arbitrarily low point. If pinged by a residential zone, an industrial zone can always immediately accept up to its employment capacity.
2. Employment is the number of workers gained from residential pings.
3. Output capacity allows for growth but also increases expectations. This is derived solely from the employment.
4. Sales is required to maintain output capacity. If sales dips below the output capacity, a penalty is put on employment capacity. If it dips significantly below, the zone will degrade.

Commercial zones record when they are pinged by residential and industrial zones. Because of the dual nature of commercial zones, you need 6 variables:
1. Employment capacity is a valve variable. It starts at some arbitrarily low point. If pinged by a residential zone, a commercial zone can always immediately accept up to its employment capacity.
2. Employment is the number of workers gained from residential pings.
3. Input capacity is the amount of product a Commercial zone can accept from an industrial zone. It is determined by employment.
4. Stock is the amount of product received from an industrial zone.
5. Output capacity is the same as input capacity.
6. Sales is determined by pings from residential zones. If sales is below output capacity, a penalty is put on employment capacity. If it dips significantly below, the zone will degrade.
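
As a first pass at the residential side of this design, here is a hedged Python sketch. The willingness table uses the example percentages from footnote 5; the cutoff of 0% beyond 40 tiles, the ResidentialZone class, its field names, and the nearest-first job search are all my own assumptions about a system that, as noted, is still untested.

```python
from dataclasses import dataclass

# (max distance in tiles, percent of residents willing to commute that far),
# taken from the example in footnote 5.
WILLINGNESS = [(5, 100), (10, 90), (20, 66), (30, 33), (40, 10)]


def willingness(distance):
    """Percent of residents willing to take a job at this distance."""
    for max_distance, percent in WILLINGNESS:
        if distance <= max_distance:
            return percent
    return 0  # assumption: nobody commutes further than the table covers


@dataclass
class ResidentialZone:
    population: int
    population_capacity: int
    employed: int = 0
    consumption: int = 0

    def seek_jobs(self, openings):
        """Fill jobs nearest-first. openings is a list of (distance, jobs)."""
        self.employed = 0
        for distance, jobs in sorted(openings):
            willing = self.population * willingness(distance) // 100
            take = max(0, min(willing - self.employed, jobs))
            self.employed += take
        return self.employment_rate

    @property
    def employment_rate(self):
        """Jobs found divided by population."""
        return self.employed / self.population if self.population else 0.0


zone = ResidentialZone(population=100, population_capacity=120)
print(zone.seek_jobs([(3, 40), (15, 100)]), zone.employed)
```

In the example, the nearby employer only has 40 openings, and only 66% of residents will travel 15 tiles, so employment tops out at 66 even though the distant employer has plenty of room.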

This system is a notable simplification of the SimCity model using SimCity 2000 design principles. However, it is a good starting point for a simulation that can be scaled up to a theoretically infinite number of zones. The modularity also means that additional features, like power or water, can be added without too much trouble. Of course, this is an untested system and systems theory problems are notoriously unpredictable. The only way to know for sure is to try it out.

¹ Called "SimCity" from here on. On the internet it is variably called "SimCity 1989", "Sim City", and rarely, "Micropolis".
² Here is that article. I'm not quite sure what the purpose of it is, nor why it's in a journal of architecture, but it's pretty thorough. Also "every known computer format" is an impossible claim. Even charitably assuming that they mean every 8-bit or 16-bit consumer format currently in production, not every format could run C! Also, the word "known" is funny here. It implies the existence of yet-undiscovered, clandestine computer formats.
³ On the PDF it's page 323. I hate it when the PDF numbers aren't synced with the hardcoded numbers.
⁴ We could say instead that Commercial zones “need to sell to customers in order to buy goods from industry”, but Residential zones could NOT say that they “need to buy goods from commercial zones in order to get employed at commercial and industrial zones”.
⁵ For example, 100% of residents are willing to work within 5 tiles, 90% within 10, 66% within 20, 33% within 30, 10% within 40 and so on.