| Deepfake: “a video or sound recording that replaces someone's face or voice with that of someone else, in a way that appears real.” |
| Over the last decade, humans have mastered a new action. As centuries of singing, cooking and laughing have shaped how we interact with each other, scrolling has become how we interact with ourselves. Monotonous and time-sapping, social feeds have become libraries that find the exact video we want to see in order to give us the exact hit of dopamine needed to scroll past it in search of another. |
| Watch. Scroll. Repeat. |
| Within the new world of short-form video, a new trend is emerging: AI-generated construction timelapses. Thirty-second snippets that show stereotypical (and computer-generated) scenes of broken infrastructure or informal urban settlements being rapidly flattened and converted into “future cities”. Short, fast and dangerously believable, these videos feed into our desire to see change. They do this not by exploring how to eradicate poverty but by proposing to physically erase it. |
| Ambition is transformed into seemingly plausible roadmaps to ‘fix’ real challenges within cities but without any exploration of consequence. These timelapses oversimplify the challenges of urban rehabilitation and risk worsening the perception of informal settlements altogether. |
The scalpel and the bulldozer
| Informal urban settlements house over 1 billion people around the world. As populations boom and outpace the supply of affordable housing, an increasing number of people are forced to resort to makeshift housing. Communities are built as a consequence of socio-economic exclusion from cities and often lack access to proper sanitation, safe water and the municipal infrastructure needed for healthy lives. In popular media and everyday chit-chat, informal settlements are routinely reduced to obstacles that are in the way of letting cities flourish. The possibilities of the land are of far more interest than providing the necessary infrastructure to the people already living there, and AI timelapses are the newest vehicle to explore this. |
| In these videos, the cities chosen for an ‘upgrade’ span developing countries and include the likes of Mumbai and Rio de Janeiro. Every city is depicted on the borders of an “urban uncanny valley”, accurate enough to be convincing to the common eye. No matter the location, the storyboard is the same. First, a shot of the location, often drowning in mountains of waste and dirt. It’s then swarmed by sanitation workers who race through the streets, bagging up trash until the streets are spotless, ready to welcome an army of yellow diggers, cranes and trucks that raze the entire settlement to the ground. The final phase shows the installation of drains and roads until, suddenly, the entire area is transformed into a model city with cafes and trams, and the renewal is complete. |
| Watch enough of these videos and the patterns become evident. Their remit extends far beyond sanitation and fixing roads, to the creation of exclusive neighbourhoods heavily informed by western design principles. Settlements are replaced by detached suburban villas on ‘block and grid’ masterplans. Congested streets are replaced with dual-lane roads complete with ornate lampposts and benches that wouldn’t feel out of place in Paris. The outcome is a form of hyper-gentrification, often closer to middle-class townships than to sensitive settlement rehabilitation. |
| By juxtaposing two staggeringly different before and after versions of what a city can look like, AI timelapses write a false narrative of what responsible urban development actually is. They ignore the extensive community involvement that is required to build common ground and understand local needs. Settlement communities are not homogeneous and sweeping masterplans are rarely effective. The absence of this thought process is not solely because the video format is too short to capture nuance but because real-world examples of urban rehabilitation look nothing like what the timelapses produce. |
A question of data
| The world of image generation has faced deserved scrutiny in recent months. Prompts to generate images of houses or streets in different countries have produced results that exemplify stereotypes and ignore reality. Chinese cities covered in red paper lanterns and Indian houses with elephants and Mughal-era arches show how AI can reinforce tired tropes, whereas the infamous Trump Gaza waterfront video shows what unchecked prompting can create. The timelapses are not exempt. Streets teeming with chaos, mountains of trash and an end result that is not a reflection of reality but of its perception. |
| Modern image generators are trained on large datasets of image-text pairs and learn the associations between words and visual patterns. The models learn statistical regularities from training data rather than directly copying examples, finding correlations in order to produce an image or video that best suits the prompt. This methodology makes them susceptible to algorithmic bias; if the input had a bias, the output will too unless actively mitigated. One form of this is ‘representation bias’: datasets may underrepresent certain regions, communities, or visual environments. When the information for a region is not present, the models have to rely on assumptions and patterns which may not be wholly accurate. Gaps in data are bridged by existing examples of urban development which may not necessarily come from the same location. |
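The fallback described above can be made concrete with a deliberately tiny sketch. It is not how an image generator works internally: it is a plain frequency count over invented (region, visual feature) pairs, and every name and number in it (`TRAINING_PAIRS`, `fit`, `most_likely_feature`, `min_examples`) is a hypothetical chosen purely for illustration. It only shows the shape of the problem: when a region has too few examples, the "model" answers with patterns borrowed from better-documented places.

```python
from collections import Counter, defaultdict

# Hypothetical toy "training set" of (region, visual feature) pairs.
# The counts are invented for illustration and are not real data.
TRAINING_PAIRS = (
    [("amsterdam", "canal")] * 40
    + [("amsterdam", "tram")] * 30
    + [("paris", "ornate lamppost")] * 30
    + [("mumbai", "informal housing")] * 2
)


def fit(pairs):
    """Count how often each feature appears, per region and overall."""
    per_region = defaultdict(Counter)
    overall = Counter()
    for region, feature in pairs:
        per_region[region][feature] += 1
        overall[feature] += 1
    return per_region, overall


def most_likely_feature(region, per_region, overall, min_examples=5):
    """Return the most frequent feature for a region.

    If the region has fewer than `min_examples` examples (an assumed
    threshold), fall back to the most frequent feature in the whole
    dataset, i.e. borrow the look of better-represented places.
    """
    counts = per_region.get(region, Counter())
    if sum(counts.values()) < min_examples:
        return overall.most_common(1)[0][0]
    return counts.most_common(1)[0][0]


per_region, overall = fit(TRAINING_PAIRS)
for place in ("paris", "mumbai", "lagos"):
    print(place, "->", most_likely_feature(place, per_region, overall))
```

In this toy setup the well-documented city keeps its own most common feature, while the sparsely documented and the entirely absent ones both inherit whatever dominates the dataset as a whole. Real models are far more complex, but the direction of the error is the same one the timelapses display.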
| Models can also be vulnerable to social and cognitive biases. If the data was already biased, this will appear in the imagery produced. Digital information with outdated stereotypes such as ‘mysterious and distant lands’ becomes the source and embedded cultural prejudices are heightened further. Bias also occurs outside the computer and in the mind of the user. A prompt such as ‘show me how a dirty riverside settlement can be transformed into something that looks like Amsterdam’ is by default going to result in biased output. While this may not be a direct result of an algorithmic bias, it reflects how today’s architecture media industry and education are founded heavily on celebrating North American and western European design. The definition of high quality has been historically reserved for certain regions, and minimal analysis has been offered to developing regions and global contexts. |
The believability problem
| Billed as features of the AI age, increased speed and possibility are bugs in the world of deepfakes. The ability to describe and generate any image severs the crucial threads of investigation and consequence. A decent internet connection is all that is needed to put the ability to propose realistic urban redevelopment within reach. No training or sensitivity is needed, and the key steps of policy analysis, community consultations and design refinement are leapfrogged. |
| On the other hand, successful urban rehabilitation projects are built with the backing of years of professional and academic experience, requiring months of project-specific research in order to propose an idea that is broadly feasible. Architects and planners are bound by laws, codes and regulation, and misrepresentation can be punishable by law. By the time a proposal is made public, it has likely gone through a series of checks with the aim of weeding out problems at an early stage. |
| Then comes the issue of the images and videos themselves. Traditional architectural renders and flythroughs, no matter how realistic, only portray the finished product at one moment in time. They rarely venture into showing the construction process itself, and for good reason: it is complex and fraught with risk, and the chance of misrepresentation is much higher. Renders are designed to look perfect because they are designed to persuade. Without any of the scuffs and dents of real buildings, architecture visualisations maintain a safe distance from reality. This is a good thing. |
| AI timelapses are wholly different. Dating back to the 1800s, timelapses are tools we have become accustomed to and are widely used to transform incremental change into digestible content. Before generative AI, timelapses captured the movement from past to present and showed verifiable progress. By filming everything that happened, they were irrefutable sources of proof. Conversely, AI timelapses speculate change from present to future, but under the same format of credibility. The flurry of movement on a screen is too fast to examine but believable enough to make sense. Videos come with surprising attention to detail, be it drainage infrastructure or the hard hats on construction workers. The videos are scruffy and imperfect, just like real construction sites. If a video has considered such tiny details, what reason is there to doubt the bigger picture? Within thirty seconds, the desire to erase informal settlements and their inhabitants is made to look entirely plausible. |
| These timelapses sit squarely within a new ecosystem of how we get our information. The dangerous addiction to short-form video is reflective not only of shortening attention spans but also of the need to publish the most eye-catching media in order to gain seconds of viewership. Timelapses are emotive pills that are fed to viewers at a time of worsening trust in governments and rallies of blaming the ‘other’. There are few, if any, ways to prevent their spread into the media world, but AI timelapses are one of the clearest signs that architectural communication is far too weak. Stories of genuine, verifiable urban rehabilitation and community-first building have to be brought to the forefront of storytelling because today’s reliance on pretty pictures only scratches the surface. AI is here to stay but it is up to citizens and communities whether it is allowed to define what is to come. |