Photography and Machine Learning: When Images Become Data Images
Goldsmiths University of London, Art and Politics, Dep. of Politics and International Relations, 2019.
Abstract
The increasing number of digital images produced and distributed today has been a subject of investigation for numerous scholars as well as the topic of a vast literature on the usage and impact of such images on our society. While the pressing issue for the majority of scholars concerns the ability of the digital image to represent reality, an increasing amount of literature questions the role of digital images in the broader, networked context of artificial intelligence and machine learning. This paper traces the evolution from traditional photography to data images to describe the shift in the essential nature of images: from objects consumed by the public to a source of raw material that feeds sophisticated autonomous learning machines able to recognise, categorise and classify objects, humans and spaces.
The paper analyses the use of images as evidence, tracing a chronology of photography in relation to technological progress. The capabilities of computer vision, combined with the continuous stream of images and machine learning, offer a wide range of beneficial possibilities, but this technology can also be misused and raises numerous ethical questions. This paper, rather than attempting to answer these questions, seeks to display their complexity in order to underscore and reiterate the risks of relying entirely on machines. Unconditional trust in technology not only fails to protect us from the probability of human error, but it also reduces our critical capabilities.
Introduction
At the beginning of September 2018, I was heading from Moscow to North Korea on assignment for National Geographic Magazine with a one-day stopover in China. One of the frustrating and annoying mishaps that any traveller may face happened to me that day. After arriving by taxi at my hotel in Beijing, I left the backpack with all my equipment and money on the back seat, and in the few minutes that elapsed before I realised it, the taxi had already disappeared into the heavy traffic. I had no receipt or any evidence of my presence in that taxi from the airport to the hotel. My backpack was lost. My Chinese friend Yan appeared nevertheless relatively certain that we would recover my things. She called a friend working in the traffic department of the Beijing police and provided him with the time that I entered the taxi at the airport and the time that I arrived at my hotel. We had dinner, and the next day I left Beijing for secretive Pyongyang with a new camera and little hope for my loss. On my return to Beijing 10 days later, I would discover that my backpack had been found in less than three days.
The procedure utilised by the police was, in fact, simple: once they entered the information about the time and place of departure and arrival of my taxi, an artificial intelligence (AI) system scanned thousands of videos and images acquired by the seemingly ubiquitous closed-circuit television (CCTV) cameras distributed across Beijing to recognise the footage showing the taxi I exited in front of the hotel. Once the image was located, the taxi licence plate number was retrieved, the driver was contacted, and my backpack was recovered. However, my immediate sense of joy was also troubled for a few days by the way it was found. If the efficiency of the Chinese police had, on the one hand, something miraculous about it, then it also had, on the other hand, something sinister.
The technological capacities of image acquisition and interpretation through AI systems and machine learning available today certainly have, like many technologies of the past at the time of their discovery, miraculous aspects; however, the capacity of a state, or any power structure, to exercise control over its population through the usage of technologies, and to justify doing so in the name of collective security, has something menacing about it and must be viewed with suspicion. As Bridle observed, ‘Over the last century, technological acceleration has transformed our planet, our societies, and ourselves, but it has failed to transform our understanding of these things’ (2018, p.2). The technological promise of computer vision in acquiring images and analysing them has as many potential benefits as troubling issues, not dissimilar, in fact, from the promise of its analogue ancestor from over a century ago: photography.
Since the advent of the digital age in the 20th century, we have lived in a time of great advancements in which technologies facilitate access to information and in which photography, as an ‘ally of technology from the moment it was invented’ (Grundberg, 1999, p.222), has a special role in the acquisition, diffusion and interpretation of such information. However, if analogue photography was an instrument to represent the world and society (and to a certain extent, a portion of digital photography continues to do so), then the proliferation of digital images has transformed the essential nature of images. Moreover, at a time when images are transformed into digital information not only by the process of converting visual signs into binary data but also by becoming readable by sophisticated systems of computer vision, which are able to classify and recognise patterns in images, we must investigate both the benefits and the potential misuse of this technology.
The paper relies on a mixed methodology, analysing photographs and the literature on photography to outline the medium’s development from analogue to digital. I argue that this development was a crucial passage in the rapid shift from images as a form of output representing individual and collective observations of the world to images as data, a source of raw material to feed the mechanisms of machine learning and AI. The paper then focuses on specific photographic examples and on the literature on cybernetics and AI to analyse troubling aspects of computer vision and machine learning technology when they are applied to images. Throughout the paper, I position myself as a participating observer, and my own experience in the field of photography is, therefore, tempered with a critical approach to the usage of photography and new digital technologies.
I believe that the technical aspects of early analogue photography intrinsically possessed the qualities that would lead today to the automated process of image recognition and classification, and, even more certainly, that the transformation from analogue to digital has transformed the essential nature of photography as well as our perception of images and their usage. If it is true that a part of digital photography continues to perform for the public the functions of analogue and traditional photography, then the appearance of digital images is only the aesthetic vestige of our habit of observing the world. The transformation of images into data images changes their meaning and functionality. Moreover, their usage as fuel for machine learning and AI systems also has implications that are more significant, and in some cases more worrying, than anything photography has done in the past. This research does not intend to denigrate or diminish the beneficial aspects of new technologies that combine the usage of images and machine learning; rather, it seeks to reveal some of the complexities of such systems when applied in contexts of surveillance and societal control.
From the Pencil of Nature to Data Images—Changing the Nature of Images
In 1844, in his public presentation of the first photographic images, ‘The Pencil of Nature’, Talbot defined the pictures as ‘obtained by the mere action of light upon sensitive paper. They have been formed or depicted by optical means alone and without the aid of anyone acquainted with the art of drawing’ (1844, p.36). The newly discovered technology promised to depict reality as it is and, as such, to be indisputable because, in the process of producing the image, the human—and therefore the chance of human error—is virtually removed. Since Talbot’s first results, photography has been widely utilised as the most efficient form of visual communication and diffusion of information. Photographs have depicted diverse aspects of our society, and for many, they remain the most reliable representation of reality. We well know today that photographs are not the ‘perfect transparent media through which reality may be represented to understanding’ (Mitchell, 1984, p.503); rather, they are ‘an opaque, distorting, arbitrary mechanism of representation, a process of ideological mystification’ (Mitchell, 1984, p.504). Nevertheless, for years, and to some extent still today, ‘because it is convincing, because it has an evidentiary value in the courtroom of reality, photography has prospered and multiplied’ (Grundberg, 1999, p.224). Even before the advent of digital images, and despite being widely criticised for the way photographic images may convey distorted and manipulated messages, photography continued to flourish because of the technological advances that made the practice of taking images easier and faster and increased the speed of reproducibility and diffusion. The democratisation of photography, which began in the 1960s with the diffusion of consumer cameras, exploded with the advent of digital photography only slightly more than 20 years ago. Not only digital cameras, but also computers, the advent of the Internet and the spread of mobile devices laid the foundation for digital photography in a multifaceted, networked digital context. The evolution of digital photography also dismantled the indexical relationship between photographic images and reality. In fact, in contrast to the process of analogue photography, digital images allow significant manipulation and processing, thus transforming photography from an object into data. However, if the ability of the digital image to represent reality has been the pressing issue among scholars, then the issue of the digital image cannot be seen only from the perspective of its perception at the level of the individual image.
The merging of photography with the Internet and digital technologies in general have had a notable effect on varied and diverse social and cultural processes and institutions including medicine, journalism, law enforcement, tourism, space exploration and fine art. (Rubinstein and Sluis, 2008, p.9, my italics)
Today, almost a century after Benjamin’s prophetic text ‘The Work of Art in the Age of Mechanical Reproduction’, in which some of the consequences of the overabundance of images had already been traced, we live in an age dominated by images in which digital technology has dematerialised photography, which today consists of a series of purely visual data—content without physical substance, images without a body.
If this is true for the 4.7 trillion images archived in 2017 alone (Marr, 2017), then it is undoubtedly even truer for the digital images that are not produced for playful, human consultation. These are the billions of ‘operative images’ (Farocki, 2004, p.660)—the digital images produced by the myriad sensors that surround our lives. These images not only lack the materiality that photography once possessed; they are also ‘invisible’ to human perception.
In his essay ‘Toward a Philosophy of Photography’, Flusser defines the images produced by a camera as ‘technical images produced by apparatuses’ (2005, p.14). This category of images, which has its roots in the Renaissance experiments with the camera obscura and in the usage of the technical perspective described by Alberti, includes not only photographs but also a broad spectrum of images, digital and otherwise, from the worlds of science, media, medicine, art and others. Therefore, as Flusser’s definition suggests, there is no distinction between the digital or analogue origin of technical images, because all apparatuses are, in fact, ‘calculating machines’ (Flusser, 2005, p.31) that ‘automate decision making and execution of material and symbolic object simultaneously’ (Beshty, 2018, p.25). The mathematical nature of digital images is therefore not only the result of the recent transformation from silver grain to pixel but is also rooted in the primordial nature of technical images. If Flusser admits that when photographers ‘look through the camera into the world [. . .] they are pursuing new possibilities’ (2005, p.26), then he also suggests that, ‘We have new kinds of images. We have images that visualise form[s] of thinking. There are numerically generated images which, let’s say, make platonic form[s] visible on [the] monitor’ (Flusser cited in Krtilova, 2016). The digital transformation of images has augmented the transformation of images into pure data.
To summarise, therefore, three characteristics allow us to consider the photographic image no longer as only a visual representation of the real world but rather as a large pool of visual information transformed into readable, classifiable and evaluable data: first, the technical and mathematical nature of the photographic process; second, the evolution and transformation of the photographic image from a material object necessarily correlated to the real object photographed to strings of zeroes and ones stored in a computer; third, the characteristic, simplified reproducibility of the photographic image enhanced by digital technology.
Digital images are easier to reproduce, multiply and manipulate. Because the feed of digital images forces us to digest more images without concentrating on single photographs, the singular, informative value of the picture is reduced. Nevertheless, in the abundance of digital images, we must also acknowledge that the value of digital images no longer lies in their individual appearance; as Henning noted, ‘the end of single and singular images seems to announce the irrelevance of approach that treats the image as representation as well as practice of close reading’ (2018, p.134). Paradoxically, while the function of individual images as carriers of cultural messages decreases, ‘within the flow of images the value is replaced by the notion of the stream of data in which both images and their significance are in the state of flux’ (Rubinstein and Sluis, 2008, p.22). Rather than being absorbed in or indulging in the information delivered by the individual image, the flow encourages us to move forward: ‘our attention shift[s] from singular photographic image to image sequences: the image “pool”, the “slideshow”, the “photo stream”, the image “feed’’’ (Rubinstein and Sluis, 2008, p.14).
Reducing the Chance of Error
The literature on the theory of photography regarding the transformation of the image from analogue to digital has focussed particularly on whether the latter possesses the veracity of traditional photography. The discussion, both at the level of general knowledge and in the academic sphere, is polarised, ranging from the notion that no photograph represents the truth to the belief that all photography does. It would be inappropriate, however, to deny digital photography the same qualities as analogue photography. In fact, if it is true that digital photography is subject to simpler manipulation techniques precisely because of its digital nature, then the same manipulations could have been performed, albeit with greater difficulty, in the darkroom. The problem of the digital image, therefore, does not lie in its veracity or falsity, and it is not in the interest of this paper to delve further into such questions; the problem lies in its material transformation.
Although a large part of digital photography continues to resemble a traditional photograph, digital transformation radically changes the function of the image itself. Stripped of its materiality and transformed into data, the digital image becomes a data image—data that resembles images. Once this transformation has occurred, we must, thereafter, ask ourselves what the functions and usages of these data images are.
Unlike traditional images, data images do not encapsulate their value in their appearance; rather, their value is in their mathematical and digital nature. Their resemblance to traditional images—their visual appearance—is simply a vestige of an old tradition behind which to conceal their data nature. The significance of data images resides in their speed of diffusion and popularity. In an age dominated by data, images become a valuable currency, and in their abundance lies their value.
If traditional photography has revealed its fallacy and the reading of individual images has proved too subjective, paradoxically, digital images, stripped of their representational value and transformed into statistical data readable by machines, promise instead to be a constant and consistent flow of valid information.
From aerial photography to satellite imagery
On May 5, 2019, CNN published a satellite image depicting the exact moment of the launch of a short-range ballistic missile from the Hodo Peninsula in the Democratic People’s Republic of Korea (fig. 1). The image was captured automatically and distributed by one of the 351 satellites operated by Planet Labs, a private company with the world’s largest constellation of Earth-imaging satellites, able to capture images anywhere on Earth daily at 3-metre and 72-centimetre per pixel resolutions. The company provides services and geocoded analysis to third-party companies and states for both civilian and military purposes. Satellite and aerial images have been largely utilised for scientific and military reasons for decades, but because of digital technological advancements, not only has the number of orbiting satellites over our heads increased (as has the number of cameras in circulation), but the number of satellite images has also grown.
Once the prerogative of a small number of sovereign states, these images are now available to a clientele of private consumers. While the overwhelming number of images on the feeds of our social network pages may be annoying, I am reasonably sure that the reader can understand that the circulation of constantly updated, high-resolution satellite images capturing every corner of our planet, on sale for financial gain, is troubling. Rather than discussing in detail the beneficial or improper usages of this specific category of images, the intention of this chapter is to chronicle how the aerial image became part of a protocol and process, as well as a mechanical assembly, foreshadowing the satellite digital images to come.
Aerial imagery is the combination of two quintessentially modern technologies: map grids and photography. Not accidentally, the two became instrumental during World War I, and both evolved technologically to gradually increase representational accuracy. Nevertheless, despite the promise to represent reality carefully, ‘photographs, like maps, represent only a mediated version of the truth’ (2019, p.10), and their technological advancement has not guaranteed any objectivity or neutrality. In his essay, ‘The Instrumental Image: Steichen at War’, Sekula carefully analyses the usage of aerial images and how their meaning relates to the way they have been utilised: ‘Aerial photographs were expected to provide enough coverage, details, and evidence of systematic change to permit the construction of a valid theory of enemy strategy’ (2016, p.36). The photograph overlaid on the grid of the map is stripped of its indexical sign to assume an instrumental ‘intelligence’ purpose. Moreover, he continues, the ‘“reading” [. . .] consisted of a mechanical coding of the image’ (Sekula, 2016, p.36); the process of analysis also required continuous updating of the images (each surpassing and outdating the previous ones), in which operational efficiency was linked to the speed of the reception of the images and of their reading by a substantial number of military experts, cartographers and photographers.
The application of aerial images, in fact, anticipated the irrelevance of the individual image’s indexical value in favour of abundance and the flow of images. Moreover, it prefigured the dynamics of the digital scenario of surveillance, from satellites to CCTV.
The digital evolution of satellite imagery not only accelerates the process of acquisition, distribution and analysis of virtually any part of our globe, continuously and infinitely, but also, because of its visual resemblance to its ancestor, aerial photography, ‘promises to offer transparent insights into major processes of the world’ (Shim, 2018, p.268). The reality in this case is that the visual appearance of satellite images as ‘real’ photographs hides the technically complex process that accompanies their production. In truth, only after the digital processing of the acquired raw data ‘can satellite imagery acquire their proper visual appearance’ (Campbell, cited in Shim, 2018, p.269). ‘Post-processing is due to the vast amount of data involved, which are transmitted through the electro-optical sensor of satellites to terrestrial stations, where they are converted into visual forms of analysis’ (Marmor, cited in Shim, 2018, p.270).
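To give a concrete sense of this post-processing, the following minimal sketch (in Python, assuming the numpy library and invented data shapes; real pipelines also involve geometric and radiometric corrections) shows how raw multispectral sensor counts might be contrast-stretched, band by band, into an 8-bit picture that only then ‘looks like’ a photograph.

    import numpy as np

    # Hypothetical raw scene: three spectral bands of 12-bit sensor counts,
    # standing in for the data transmitted from the satellite to the ground.
    raw = np.random.randint(0, 4096, size=(3, 512, 512), dtype=np.uint16)

    def to_display(band, low_pct=2, high_pct=98):
        # Contrast-stretch one band between its 2nd and 98th percentiles,
        # then rescale it to the 0-255 range of a displayable image.
        lo, hi = np.percentile(band, [low_pct, high_pct])
        scaled = np.clip((band.astype(float) - lo) / (hi - lo), 0, 1)
        return (scaled * 255).astype(np.uint8)

    # Stack the processed bands into an ordinary-looking RGB picture.
    rgb = np.stack([to_display(b) for b in raw], axis=-1)
    print(rgb.shape, rgb.dtype)  # (512, 512, 3) uint8

Only at the last line does anything resembling a photograph exist; everything before it is pure calculation on numbers.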
This conversion from data to images not only demonstrates the mathematical nature of digital images but is itself increasingly marginal today. Harun Farocki called such images, ‘made neither to entertain nor to inform’, operative images: ‘These images [. . .] do not represent an object, but rather are part of an operation’ (2004, p.660). As in the case of aerial photography during World War I, satellite images initially required professional interpretation, and for that reason, the acquired data images demanded conversion into a visual representation. Today, the newest technologies, such as computer vision, AI and machine learning, do not require this reconversion because they interpret the information directly from the acquired data. The need to make satellite images perceived as ‘real’ photographs is purely of a popular nature; like many digital images, satellite images resemble photographs and promise us the same accuracy and comfort that we appreciated in Talbot’s photographic plates in the middle of the 19th century, while the mathematical nature of the data image guarantees the efficiency and certainty of computer science.
The fact that the image depicting the North Korean missile was distributed publicly should, then, be more alarming than reassuring. Images such as those made available by Planet Labs do not have the purpose of representing space and geography as individual picture-objects; rather, their aim is to collect large datasets in which each image updates the previous one with fresh data. The feed of data images is ingested in a continuous loop with the intent to quantify, calculate, classify, create patterns of recognition and execute actions, while potentially removing human error from the process: ‘with Planet’s machine learning-based analytic feeds, intelligence groups can detect change and prioritise resources quickly based on actual need’ (‘Daily Satellite Imagery and Insights’, 2019).
Images as evidence
In 2000, American artist Taryn Simon produced a series of 50 portraits of individuals who were wrongly arrested and convicted, subsequently freed from death row, and thereafter exonerated utilising DNA evidence. ‘The Innocents’ is a striking collection of beautifully captured photographic portraits and testimonial captions by the sitters that examines the ambiguous ability of photography to blur reality and fiction, challenging the usage of photographs as evidence in the judicial system.
The primary cause of wrongful conviction is mistaken identification. A victim or eyewitness identifies a suspected perpetrator through law enforcement’s use of photographs and lineups. This procedure relies on the assumption of precise visual memory. However, through exposure to composite sketches, mugshots, Polaroids, and lineups, eyewitness memory can change. In the history of these cases, photography offered the criminal justice system a tool that transformed innocent citizens into criminals. Photographs assisted officers in obtaining eyewitness identifications and aided prosecutors in securing convictions. The criminal justice system had failed to recognise the limitations of relying on photographic image (Simon, 2000).
By making the portraits ‘at a site that came to assume particular significance following a wrongful conviction: the scene of misidentification, the scene of arrest, the alibi location, or the scene of the crime’ (Simon, 2000), Simon confronts photography’s ability to blur truth and fiction, and suggests that the interpretation of photography depends on the context in which the photographs are seen. By demystifying her own pictures that re-enacted, for example, the scene of the arrest, and accompanying the photograph with a lengthy caption that reveals both the reality of the circumstances and the possible visual misunderstanding, Simon also demystifies the meaning of the images utilised by the police to build the indictment and justify the sentence of guilt. In the case of Larry Mayes, for example (fig. 2), who was portrayed hiding between two mattresses in his room when he was arrested, Simon reports in the caption: ‘The victim identifies Mayes in a photographic array, but only after failing to identify him in two lineup procedures’ (Simon, 2003, p.70).
The juxtaposition of photographs and captions reveals the relation between truth and fiction, reinforcing the historical debate about photographs’ ability to depict ‘reality’ and the witnessing power of a photograph. Moreover, the caption reveals the victim’s crucial role in the identification process. In fact, it is not the photograph that produced the mistake but, as in the example presented, the victim who, through an image, mistakenly recognised Mayes as the attacker.
If it is true that photographic images are sometimes admitted as evidence in trials, and that ‘the belief in a special relation of photographs to the visible world provided the crucial premise favouring the admission of photographs into evidence’ (Snyder, 2004, p.217), then it is also true that their admissibility has found resistance since the birth of the photograph itself. As Snyder summarises, ‘Photographs today are routinely admitted into evidence because the people who made them can be questioned in the witness box’ (2004, p.217). The legal and testimonial value of the photographs does not lie within the image itself but in the witness’s statements.
Nevertheless, cameras and, increasingly, their recent digital successors are expected to be tools for analysis that often result in instruments of social control. Police photography, for example, reached its peak at the end of the 19th century with the work of Alphonse Bertillon. Bertillon established a photographic documentation programme, the result of which would become a series of comprehensive cataloguing albums. ‘The plates contained summaries of all the fragmented and accumulated facial elements that provided traits that could be recombined, giving rise to an infinite number of existing and possible semblances’ (Fontcuberta, 2012, p.75).
Bertillon’s system prefigures what today’s digital technology enables in an agile and effective manner. Once the fallacy of the single photograph is accepted, the increased abundance of visual sources becomes the system that promises to reduce the probability of error: more images mean more possibilities for cross-checking. Moreover, the classification by fragmentation developed by Bertillon became not only the method utilised by the police for the classification and identification of criminals and missing persons, but also, as we see in more detail in the next chapter, the basis on which the contemporary technologies of facial recognition and other forms of computer vision are founded.
These new technologies are fundamental in crime prevention and forensic analysis techniques. Like all images, those produced by digital systems do not have specific malicious or benign characteristics; their meaning lies in their usage and reading. Nevertheless, in the rhetoric of the new security and surveillance context, the new visual acquisition devices capture ever more moments of everyone’s lives. ‘Whilst under surveillance, everyone is a potential criminal. Criminology has not yet explained what these machines actually produce, or what they prove, or what is the value of their evidence’ (Biber, 2005, p.37).
The individuals portrayed by Simon were exonerated utilising DNA evidence, as she noted in the foreword of her book:
Only in recent years have eyewitness identification and testimony been forced to meet the test of DNA corroboration. Eyewitness testimony is no longer the most powerful and persuasive form of evidence presented to juries. [. . .] In our reliance upon these new technologies, we marginalise the majority of the wrongfully convicted, for whom there is no DNA evidence, or those for whom the cost of DNA testing is prohibitive. Even in cases in which it was collected, DNA evidence must be handled and stored and is, therefore, prey to human error and corruption. Evidence does not exist in a closed system. Like photography, it cannot exist apart from its context, or outside of the modes by which it circulates. (2003, p.7)
The usage of new technologies to supplant previous ones is a common practice in the development of humanity. Replacing the eyewitness with the DNA test is an attempt to reduce, in court, the probability of human error—to replace subjectivity and bias with science. Paradoxically, in transforming images into data images, we proceed with the same logic. As previously described in this chapter, by transforming images into data images, we no longer consider images merely for their appearance and visual function, but as carriers of data that can be analysed, verified, organised and classified. If images thus become a source of data, the question to be answered is how we will manage this almost infinite amount of data.
Reading the Images—from Human to Machine Learning
Super recognisers and redundant images
In the attempt to solve the poisoning of Sergey Skripal, investigators on the case collected 11,000 hours of CCTV footage. Britain is one of the most surveilled nations in the world, ‘with an estimated one surveillance camera per 11 citizens’ (Barry, 2018). To narrow the number of suspects, detectives turned to a specialised unit in London’s Metropolitan Police called ‘Super Recognisers’. The term was coined by Prof. Richard Russell of Harvard University to define ‘people with extraordinary face recognition ability’ (Russell et al., 2009, p.252) during a study on prosopagnosia, ‘a condition in which patients are unable to recognise human faces’ (Keefe, 2016). Although Super Recognisers have proven effective in identifying numerous criminals between 2009 and today, recent research ‘suggests the ability might be genetic—that this skill is hard-wired, somehow, within 1–2% of the population’ (Moshakis, 2018). The potential human resources available appear, nevertheless, to be insufficient to review the images produced by the 5.9 million CCTV cameras in Great Britain alone. Even under the assumptions that only a selected amount of footage and images would be preselected for specific investigations and that Super Recognisers would be utilised only in particularly urgent cases, what criteria would guarantee the correctness of the identification? As in the cases presented in the previous chapter through the work of Taryn Simon, reliance on human ability in the process of criminal identification may lead to misjudgement and error. This appears to be the case even when the task is conducted by a small number of experts with extraordinary skills. While it has been claimed that Super Recognisers can make societies safer and fairer by improving the accuracy of facial identification, ‘the current understanding of superior face processing does not justify widespread interest in Super Recognizers’ deployment’ (Ramon et al., 2019).
If identification through images has indeed demonstrated its fallibility over the years, because of both the ambiguous nature of the images and the probability of subjective human interpretation and error, then the accuracy of facial identification through images is nevertheless regaining an increasingly important role in surveillance, both to prevent miscarriages of justice and to aid criminal investigations. This should come as no surprise, because the usage of images combined with the new digital technologies of computer vision offers numerous opportunities in the field of forensics as well as in political, commercial and scientific applications. The abundance of images in circulation today and their nature as data images, as described in the previous chapter, constitute an inexhaustible reservoir from which to draw information to store, record, organise, classify and evaluate. As Shoshana Zuboff observes,
Nothing is too trivial or ephemeral for this harvesting: Facebook ‘likes,’ Google searches, emails, texts, photos, songs, and videos, location, communication patterns, networks, purchases, movements, every click, misspelled word, page view, and more. Such data are acquired, datafied, abstracted, aggregated, analysed, packaged, sold, further analysed and sold again. (2015, p.79)
While other forms of data may be simpler to organise and classify, images, as previously described in this paper, require more sophisticated techniques of reading and interpretation, and their profusion makes the task impossible at the human level.
The ambition to liberate the human being from heavy and repetitive tasks has roots in the Industrial Revolution, but the replacement of man with machines was limited to specific areas in which the machine imitated and reproduced humans’ or animals’ bodily functions. As Manovich has noted,
The idea of computer vision became possible [. . .] only with the shift from industrial revolution to post-industrial society after World War II. The attention turned from automation of the body to the automation of the mind, from physical to mental labour. This new concern with the automation of mental function such as vision, hearing, reasoning, and problem solving is exemplified by the very name of two new fields that emerged during the 1950s and 1960s: artificial intelligence and cognitive psychology. (1997, p.8)
Norbert Wiener, the founding thinker of cybernetics and communication theory, introduced the idea of a possible comparison between human and machine in the execution of mental tasks. Wiener based his intuition on the second law of thermodynamics, whereby nature tends toward entropy while man as much as machine ‘both [. . .] exemplify a locally anti-entropic process’ (1989, p.32). His definition of feedback as the ‘control of a machine on the basis of its actual performance rather than its expected performance’ (1989, p.24) acknowledges the importance of future developments in society to ‘produce a temporary and local reversal of the normal direction of entropy’ (1989, p.24).
If we apply the same principle to the contemporary scenario of an overabundance of data images, then the system of image flow is tending toward disorder, thus preventing the analysis of the single image. The ‘background noise—the message contaminated by external disturbances’ (Wiener, 1961), or the lack of accuracy of the message transmitted by the data images, is fought precisely through its redundancy. Photography in the form of data images is ‘the background noise of consumer culture’ (Rubinstein and Sluis, 2008, p.23): ‘the more copies of a message sent, the more likely, through statistical analysis of the message, that the original message might accurately convey’ (Beshty, 2018, p.27).
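A toy illustration, under deliberately simplified assumptions, of the statistical point quoted above: if many noisy copies of the same binary message are transmitted, a simple majority vote across the copies recovers the original with increasing reliability.

    import random

    random.seed(0)
    message = [1, 0, 1, 1, 0, 0, 1, 0]   # an arbitrary original message
    noise = 0.2                          # each bit flips with 20% probability

    def transmit(bits):
        # Send one noisy copy: every bit may be corrupted independently.
        return [b ^ (random.random() < noise) for b in bits]

    for copies in (1, 5, 51):
        received = [transmit(message) for _ in range(copies)]
        # Majority vote, bit by bit, across all redundant copies.
        decoded = [int(sum(col) * 2 > copies) for col in zip(*received)]
        errors = sum(d != m for d, m in zip(decoded, message))
        print(copies, 'copies ->', errors, 'wrong bits')

The single copy is frequently corrupted; with 51 redundant copies, the decoded message is almost always exact. Redundancy, not the individual transmission, carries the accuracy.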
The Super Recognisers’ attempt to organise thousands of CCTV images in the search for a suspect can be realised only in a small number of circumstances and, as previously described, does not guarantee sufficient reliability. On the contrary, the tirelessness of the machine not only guarantees a continuous cycle of work and analysis, but also promises to ensure the accuracy of the result through the redundancy of the data images and their statistical analysis.
Computer vision and facial recognition
With the invention of photography, humans succeeded in fabricating a machine that can mimic the way we see; however, the act of seeing is not only the ability to record a scene: ‘vision is the interpretation of images that lead to actions or decision’ (Learned-Miller, 2011, p.2). The ability of a camera to record images, therefore, does not match the ability to see. Computer vision, hence, must be understood as the ability to see and understand the content of images in the way humans do. As for humans, the readability and interpretation of images requires intelligence. In the case of the computer, such intelligence is generally described as AI, and as Manovich notes, the term ‘may refer simultaneously to two meanings of intelligence: reason, the ability to understand, and information concerning an enemy or a possible enemy or an area. Artificial Intelligence: artificial reason to analyse collected information, collected intelligence’ (Manovich, 1997, p.8). From this perspective, it should therefore follow that, although computer vision technology today has many applications, its main usage is in the surveillance sector. As described in the previous chapters, traditional surveillance systems continue to depend in part on human capabilities to analyse images, but thanks to the abundance of images, their current digital nature and the advanced computational capacity of algorithmic systems, many techniques, such as facial recognition, can be deployed to implement automatic surveillance.
In contrast to image processing, which analyses images at the pixel level and generally accompanies the process of acquisition and elaboration of images into digital form, computer vision can see mathematically things that humans cannot see, as well as recognise features while detecting and classifying patterns. Pattern recognition is ‘the act of taking raw data and making an action based on the category of the pattern’ (Duda et al., cited in Mäkinen, 2009, p.5). While numerous techniques have been implemented to apply pattern recognition and classification to computer vision, recent technology involves the usage of AI and requires three sequential steps: first, inspect part of the image via an algorithm; second, obtain extracted features; and third, classify them utilising machine learning. Facial recognition technology has been developed on the basis of different machine and deep learning technologies, but in general, the process requires the computer to develop an automated learning capability through a technique known as a CNN (Convolutional Neural Network), built around the convolution, an integral that measures how much two functions overlap, and trained on the data introduced. ‘In case of face analysis, machine learning means that the computer adapts to the classification problem so that it can distinguish faces from non-faces, identifies the faces, genders of the faces and so on’ (Mäkinen, 2009, p.5). In other words, the computer learns from examples, and the learning proceeds in such a way that classification occurs through repetitive comparison between the sample set and the desired result. The more data images that are introduced into the network, the more accurate the result is claimed to be.
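As a purely illustrative sketch (assuming the PyTorch library; the layer sizes and the face/non-face task are invented for the example), the three steps can be read directly in the structure of a small CNN: convolutional filters inspect parts of the image, pooling condenses the extracted features, and a final layer classifies them.

    import torch
    import torch.nn as nn

    class FaceNonFaceCNN(nn.Module):
        def __init__(self):
            super().__init__()
            # Step 1: convolutions slide small learned filters over the
            # image, measuring how strongly each patch overlaps with them.
            self.features = nn.Sequential(
                nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
                nn.MaxPool2d(2),
                nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
                nn.MaxPool2d(2),
            )
            # Step 3: a linear layer classifies the extracted features
            # into two categories: face or non-face.
            self.classifier = nn.Linear(32 * 16 * 16, 2)

        def forward(self, x):
            x = self.features(x)        # step 2: obtain extracted features
            x = torch.flatten(x, 1)
            return self.classifier(x)

    model = FaceNonFaceCNN()
    batch = torch.randn(8, 3, 64, 64)   # eight dummy 64x64 RGB images
    print(model(batch).argmax(dim=1))   # predicted class for each image

Training would proceed exactly as the text describes: the network’s outputs on labelled examples are repeatedly compared with the desired result, and the filters are adjusted until the classification improves.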
It is not the intention of this paper to discuss the technicalities of facial recognition in more detail, yet it is precisely in its technical complexity that one can already recognise the intricacies of its practical usages. The problems of the facial recognition system lie in the complexity of the system itself: first, the classification depends on the quantity and composition of the dataset (the images inserted); second, the system is configured to develop and learn by itself and therefore tends to constitute a closed system in which the criteria of classification, control and output are not fully verifiable. Moreover, computer vision technology performs in such a way that every ‘representative image can become part of a surveillance system even if it was not produced with the intention of being so—especially if these images are digital’ (Saugmann, 2018, p.288). Whether the images are acquired by state surveillance system devices or private companies, voluntarily posted on a social network or extracted from the Internet or stock image databases, all these images—these data images—become part of a networked system that can be analysed, classified and linked to a virtually infinite number of additional social, economic and geo-localisation data.
In 2009, Prof. Li Fei-Fei of Stanford University created a dataset of images called ImageNet, which today contains more than 14 million images sorted into almost 22,000 synsets. The database is among the largest datasets of organised images in the world, used to train algorithms, to index, retrieve and annotate multimedia data and to develop more accurate systems of computer vision.
The dataset also became the object of a competition in which researchers aim to develop improved algorithms for image classification. In the latest edition of the contest, in 2017, the image classification error rate dropped to 2.25%, with Chinese teams holding the lead for the third consecutive year.
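In practice, querying a model trained on ImageNet takes only a few lines. The following is a hedged sketch assuming the torchvision library and its bundled ResNet-50 weights (a standard ImageNet model, not the competition winners themselves) and a hypothetical input file, example.jpg.

    import torch
    from torchvision import models
    from PIL import Image

    weights = models.ResNet50_Weights.IMAGENET1K_V2
    model = models.resnet50(weights=weights)
    model.eval()

    preprocess = weights.transforms()                 # resize, crop, normalise
    image = Image.open('example.jpg').convert('RGB')  # hypothetical input
    batch = preprocess(image).unsqueeze(0)

    with torch.no_grad():
        probs = model(batch).softmax(dim=1)

    # Each output index corresponds to one of the 1,000 ImageNet synsets.
    top = probs.topk(3)
    for p, idx in zip(top.values[0], top.indices[0]):
        print(weights.meta['categories'][idx], float(p))

The ease of this operation is precisely the point: classification at this scale is a routine, automatable act.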
In July 2017, China announced its ambition to become the world leader in AI innovation and is ‘embracing technologies like facial recognition to identify and track 1.4 billion people’ (Mozur, 2018). Surveillance and facial recognition software in China can rely not only on the 300 million cameras planned for installation by 2020 but also on the fact that ‘Chinese technology giants are taking advantage of the breadth and depth of data to build extremely precise profile of users’ (Ding and Gong, 2019, p.137). Many Chinese technology start-up companies, such as SenseTime, Yitu, Megvii and others, offer software ‘to police bureaus to help them identify faces, crowd movement, car license plate numbers, and vehicle types’ (Horwitz, 2018). In the name of safety and security, the same technologies that allowed me to recover my forgotten backpack from a taxi in Beijing are utilised to profile Chinese citizens into a system of social credit and to recognise, distinguish and classify drug addicts, jaywalkers and criminals, as well as to surveil and control the daily activities of individuals.
The technology in China is being utilised to create a new system of social credit that is developing ‘comprehensive, constantly updated, and granular records on each citizen’s political persuasions, comments, associations, and even consumer habits’ (Diamond, 2018). Moreover, the Chinese government has elaborated sophisticated algorithms that ‘support facial recognition to identify Uighur/non-Uighur attributes’ (Mozur, 2019) to distinguish the Muslim Uighur minority from the Han majority, explicitly targeting the ethnic group with arbitrary detention and repression. While in Western countries the diffusion of facial recognition technology has raised concerns because of the problem of bias, in other words, because of its potential inaccuracy (the technology is more adept at classifying males than females, and it performs better on people with lighter skin than on those with darker skin), in China we must raise concerns because of its frightening efficiency.
As Paul Mozur has reported,
While facial recognition technology uses aspects like skin tone and face shapes to sort images in photos or videos, it must be told by humans to categorize people based on social definitions of race or ethnicity. Chinese police, with the help of the start-ups, have done that. (Mozur, 2019)
The reason for this fine-grained level of classification has little to do with safety and security; rather, its basis is social control, discrimination and abuse of power. The Chinese scenario illustrates that the problem with the usage of facial recognition in the real world arises not only when the technology proves to be inaccurate—for example, when it could lead law enforcement to identify the wrong person as a suspect in a crime—but also when the technology proves to be particularly efficient and is available to governments or to unscrupulous and unregulated public or private organisations. The beginning of a dystopian society, which already exists in China, is not a remote probability for the globalised society in which we live, considering that the potential misuse of these technologies is not the privilege of governments alone but also of companies willing to sell them in exchange for profit. Furthermore, the raw material that makes the technology work—the data images—is becoming increasingly easy to produce; moreover, its production is largely provided voluntarily, if unconsciously, by unsuspecting users.
Whether the classification, tagging or embedding of metadata into digital images is voluntarily performed by everyday users sharing selfies on social networks or added subsequently by a sophisticated convolutional neural network, and whether the images are enthusiastically shared on a social platform with close friends or maliciously captured by hidden surveillance cameras, satellites or CCTV, the images are stripped of their original nature. Transformed into data images, images now function as a ‘vast database of indexed photographs, which can be remixed and remapped’ (Rubinstein and Sluis, 2008, p.13). The transformation of images into data images not only obliterates the individual value of the image, subordinating its meaning to its belonging to a larger group of similar images, but also delegates to a machine trained to read images our ability to judge, our arbitrariness and our ability to disagree. If we are becoming accustomed to believing a machine when it tells us that millions of disparate photographs classified as ‘dog’ cannot represent anything other than a dog, then we are also ceasing to question whether the machine may be wrong. It may not be long before we believe, or are convinced to believe, that someone classified by a machine as a criminal must be a criminal or that any threat to society must be such because the machine classified it so.
In anticipation of current events, Paul Virilio wrote,
Once we are definitively removed from the realm of direct or indirect observation of synthetic images created by the machine for the machine, instrumental virtual images will be for us the equivalent of what a foreigner’s mental pictures already represent: an enigma. (1994, p.60)
When we rely on technology to understand, classify and judge the world, entrusting it with the role of a neutral and independent referee, we persist in the enduring fantasy that machines can reduce human error. By blindly believing the machine's judgement of the image, we risk becoming blind in our capacity for individual judgement, in our capacity to disagree, to disobey or simply to act and think differently from the mainstream narrative.
Understanding and relying on the machine
If it is true that ‘Every photograph is, in fact, a mean[s] of testing, confirming and constructing a total view of reality. [. . .] Hence the necessity of our understanding a weapon which we can use and which can be used against us’ (Berger, 1972, p.182), then the combination or, should we say, the collaboration between images and AI requires an additional level of alertness regarding the usage of these technologies. ‘Arranging the world from the perspective of the machine renders it computationally efficient but makes it completely incomprehensible to humans. Moreover, it accelerates their oppression’ (Bridle, 2018, p.116). The problem is not merely in the accuracy or otherwise of the system—as described above, it is more worrying when the system reaches high levels of efficiency—rather, it lies in its functioning and lack of transparency. On the one hand, the raw material that feeds the system, the data images, are no longer the individual images that we are accustomed to scrutinising, observing, and analysing, but a flow of data ‘made by machines for other machines, with humans rarely in the loop’ (Paglen, 2019, p.24); on the other hand, the algorithmic system that processes them ‘is so complicated that even the engineers who designed it may struggle to isolate the reasoning for any single action’ (Knight, 2017).
In a report published by the House of Lords in 2018, the Select Committee on Artificial Intelligence introduced the idea of intelligible AI (2018, p.38) to create accountability for the outcomes of AI system decisions, but it concurrently admits ‘that achieving full technical transparency is difficult, and possibly even impossible, for certain kinds of AI systems in use today’ (2018, p.38), and that,
it is not acceptable to deploy any artificial intelligence system which could have a substantial impact on an individual’s life, unless it can generate a full and satisfactory explanation for the decision it will take. In cases such as deep learning neural networks, where it is not yet possible to generate a thorough explanation for the decisions that are made, this may mean delaying their deployment for particular uses until alternative solutions are found. (Select Committee on Artificial Intelligence, 2018, p.41)
Admitting the complexity of the issue is undoubtedly a step forward in the regulation of such technologies, but it is also evident that the political, social and commercial interests at stake do not make the task simple. As discussed in the previous chapter, these technologies are not only the prerogative of governments and states but also of companies and organisations, which, as in the past, protect their interests behind black box systems that prevent the scrutiny of their methodology, blaming complexity and intellectual property protection for the secrecy. Whether the complexity and obscurity are intentionally maintained by the organisations that design and deploy it or innate to the system itself, this should certainly and in no way make us less vigilant. On the contrary, if we embrace Agamben’s view for whom the apparatus is ‘literally anything that has in some way the capacity to capture, orient, determine, intercept, model, control, or secure the gestures, behaviours, opinions, or discourses of living beings’ (2009, p.6), then we should be concerned about the implications of the usage of such technologies because the apparatus that implements them—be it technical, political or both—makes it difficult, if not impossible, to understand, dismantle and challenge them.
The black box that hides the decision-making processes of AI, computer vision and machine learning, in combination with the data images, is not so different from the mystery that once hid the magic of the camera obscura and, shortly thereafter, the photographic apparatus. As Flusser noted in his study about photography,
Apparatuses are black boxes that simulate thinking in the sense of a combinatory game using number-like symbols: at the same time, they mechanize this thinking in such a way that, in the future human beings will become less and less competent to deal with it and have to rely more and more on apparatuses. Apparatuses are scientific black boxes that carry out this type of thinking better than human being[s] because they are better at playing (more quickly and with fewer errors) with number like symbols. Even apparatuses that are not fully automated play and function better than the human beings that operate them. (2005, p.32)
Over time we have learned to doubt the promising ambitions of photography, and we have also educated ourselves to appreciate the possibility of learning to recognise and criticise the misleading messages that photography can convey. We have also developed a healthy sense of suspicion in recognising that the ambiguities produced by the photographic image reside in both the apparatus that produces them and the viewer’s judgement. In other words, we have learned to have a critical sense of technology and to understand that it does not protect us from the possibility of human error. What we are encountering now is a new apparatus that combines the complexity of calculus with the power of images. The promises of this new technology are relatively similar to the promise that photography and Talbot’s first plates made: accuracy of representation and an almost-divine, objective truth. We have learned that photography is neither objective, accurate nor divine, and similarly, this new ‘technology does not emerge from a vacuum. Rather, it is the reification of a particular set of beliefs and desires: the congruent, if unconscious, dispositions of its creators’ (Bridle, 2018, p.142).
Concluding Thoughts
The questions raised in this paper are not intended to discredit or dismiss the usage of machine learning and images. Despite my argument, the technology has proven to produce opportunities for those who have the skill and determination to utilise it for investigation. I recently participated in a conference at the National Geographic Society in which a large number of scientists, geographers, conservationists, photographers and explorers were enthusiastic about the possibilities offered by the analytical abilities of computer vision and machine learning applied to images, for example, to defend protected areas from deforestation or to track and study endangered species. Even in areas of application which concern the investigation of social justice, human rights violations or environmental issues, these new data technologies offer valuable opportunities and may be utilised to challenge mainstream narratives. Nevertheless, the complexity of the system and the potential misuse of such technologies must be seriously considered. If data is effective in ‘abstracting the world into categories’ (Kitchin, 2014, p.1) and machines cause us to believe that we are exonerated from the risk of making mistakes, then the society we are destined to inhabit will be categorised and classified with no room for error or disagreement.
The problem will not be making mistakes, which could create opportunities for human learning and which will be carefully prevented by machines. The problem will be adapting our behaviour for fear of making mistakes in the first place, and of being judged, accused, imprisoned or simply laughed at for them. ‘Excessive control and perfection too close at hand, the absence of spontaneity and the abolition of the incident, seem virtuous, but end up becoming distorting’ (Fontcuberta, 2012, p.202, my translation from the Italian). Fontcuberta wrote these words in the final pages of his book about the shift from analogue to digital photography to describe criticisms levelled at digital photography with respect to analogue photography. I utilise them here as a metaphor to conclude and summarise my analysis of the transformation of photographic images into data images and their usage as fuel to feed the systems of machine learning in the practices of societal control. If it is important to predict a nuclear attack via the automated analysis of satellite imagery, to recognise a potential terrorist in the middle of a crowd via facial recognition software or to identify the patterns of glacial melting—or even to recover a lost backpack—then it is also important that we understand, question and are able to recognise whether these conditions are correctly attributed, may be misleading or were fabricated with malicious intent.
As described in this paper and as photography in the past has demonstrated, technology does not protect us from making mistakes; on the contrary, in shaping our future, AI may become the tool in the hands of few that will diminish our ability to be critical. As a practising photographer, I have questioned and continue to question the capabilities of photography and images, their usage and ambiguity. It is also this ability to interpret with a critical sense, even when we err, that makes us thinking and conscious animals. Consigning to machines our ability to interpret will not shield us from making mistakes but will relegate us to the passive role of irrelevant, controlled and manipulated observers.
Notes
[1] In ‘The Work of Art in the Age of Mechanical Reproduction’, Walter Benjamin illustrates the cultural leap from the manual reproduction of art to the mechanical one. In particular, with the invention of photography, the practices of mechanical reproduction of the work of art bring about a real mutation of the very idea of art, in which the reproduced work no longer has anything transcendental; it has lost its ‘aura’ and becomes, therefore, contingent like everything else and like every commodity. The cultural value of the work is replaced by its display value. Although this is, on the one hand, a process of democratisation, on the other hand, it produces the affirmation of a type of fruition no longer based on contemplation but on distraction. The subsequent acceleration of technology and digital photography has accentuated this process, affecting the way we consume and perceive images.
[2] The camera obscura (Latin for ‘dark chamber’) consists of a closed device, box or room, into which light enters through a small hole and is projected onto an opposite surface, generating an upside-down image of what lies outside. The camera obscura has been used by painters and scientists since the time of Aristotle, and widely since the 16th century. Considered the ancestor of photography, it was also the first tool (apparatus) to produce technical images.
[3] ‘ImageNet is an image dataset organized according to the WordNet hierarchy. Each meaningful concept in WordNet, possibly described by multiple words or word phrases, is called a "synonym set" or "synset". There are more than 100,000 synsets in WordNet, [the] majority of them are nouns (80,000+). In ImageNet, we aim to provide on average 1,000 images to illustrate each synset. Images of each concept are quality-controlled and human-annotated. In its completion, we hope ImageNet will offer tens of millions of cleanly sorted images for most of the concepts in the WordNet hierarchy.’ Source: ImageNet website, http://www.image-net.org
[4] The Chinese social credit system (announced by the government in 2014) is a development of the data-analysis verification systems commonly used by private and public companies to assess, for example, the creditworthiness of a borrower. The Chinese system extends this idea to every aspect of life, judging and evaluating the public and private behaviour of potentially every individual. When implemented at the national level, each citizen will be assigned an identification number under which all the digital information retrieved about them by public or private organisations will be collected. Based on these constantly updated data, the automated system managed by the government will be able to assign a score to each individual citizen. As with a credit score, the social score may increase or decrease according to each citizen’s behaviour. The methodologies and criteria are not public, but traffic offences, illegal use of the internet, behaviour considered anti-government or smoking in non-smoking areas can, for example, lead to restrictions on purchasing train tickets, limits on travelling freely and even detention.
[5] In the article ‘Facial Recognition Tech Is Growing Stronger, Thanks to Your Face’, which appeared in The New York Times on July 13, 2019, journalist Cade Metz reports numerous abuses perpetrated by U.S. technology companies as well as university and governmental bodies in collecting and building facial image datasets without the consent of millions of unwitting users.
[6] Forensic Architecture is an example of a successful multidisciplinary investigative agency which utilises computer vision, machine learning and data visualisation technology in its investigative and artistic projects. Based at Goldsmiths University of London, they ‘undertake advanced spatial and media investigations into cases of human rights violations, with and on behalf of communities affected by political violence, human rights organizations, international prosecutors, environmental justice groups, and media organizations’ (Forensic Architecture Agency, 2019). For more information about their projects, see the book ‘Forensic Architecture: Violence at the Threshold of Detectability’ by Prof. Eyal Weizman, the founding director of the agency.
Bibliography
Agamben, G. (2009) ‘What Is an Apparatus?’ and Other Essays. Meridian, Crossing Aesthetics. Stanford, Calif: Stanford University Press. [online]. Available from: http://search.ebscohost.com/login.aspx?direct=true&db=nlebk&AN=1131333&site=ehost-live.
Anon (2019) Daily Satellite Imagery and Insights [online]. Available from: https://planet.com/ (Accessed 1 June 2019).
Barry, E. (2018) From Mountain of CCTV Footage, Pay Dirt: 2 Russians Are Named in Spy Poisoning. The New York Times. 5 September. [online]. Available from: https://www.nytimes.com/2018/09/05/world/europe/salisbury-novichok-poisoning.html (Accessed 6 June 2019).
Barthes, R. (2010) Camera lucida: reflections on photography. New York: Hill and Wang.
Benjamin, W. (2008) The work of art in the age of its technological reproducibility, and other writings on media. Cambridge, Mass.: Harvard University Press.
Berger, J. (1972) Selected essays and articles: the look of things. New York: Viking.
Beshty, W. (2018) Picture Industry: A Provisional History of the Technical Image 1844-2018. 1st edition. Arles: SAS LUMA.
Biber, K. (2005) Photographs and Labels: Against a Criminology of Innocence. Law Text Culture. 10. [online]. Available from: http://ro.uow.edu.au/ltc/vol10/iss1/4 (Accessed 8 October 2019).
Bridle, J. (2018) New Dark Age: technology and the end of the future. London: Verso.
Chun, R. (2018) China’s New Frontiers in Dystopian Tech [online]. Available from: https://www.theatlantic.com/magazine/archive/2018/04/big-in-china-machines-that-scan-your-face/554075/ (Accessed 12 July 2019).
Cohen, Z. & Gaouette, N. (2019) Exclusive: Images show North Korea missile launch as Pyongyang tests Trump. CNN. [online]. Available from: https://www.cnn.com/2019/05/05/politics/north-korea-missile-launch-image/index.html (Accessed 16 August 2019).
Cotton, C. et al. (2010) Words without pictures. New York: Aperture.
Diamond, L. & Mitchell, A. (2018) China’s Surveillance State Should Scare Everyone [online]. Available from: https://www.theatlantic.com/international/archive/2018/02/china-surveillance/552203/ (Accessed 12 July 2019).
Ding, J. & Gong, E. (2019) ‘Exploring Uses of AI and Data in China Today’, in AI - More Than Human. London: Barbican International Enterprise. pp. 137–139.
Eder, J. & Klonk, C. (2017) Image operations: visual media and political conflict. Manchester, UK: Manchester University Press.
Farocki, H. (2004) ‘Phantom Images’, in Walead Beshty (ed.) Picture Industry: A Provisional History of the Technical Image 1844-2018. 1st edition. Arles: SAS LUMA.
Flusser, V. (2005) Towards a philosophy of photography. 4th edition. London: Reaktion.
Fontcuberta, J. (2012) La (foto)camera di Pandora. La fotografi@ dopo la fotografia. Roma: Contrasto.
Forensic Architecture Agency (2019) Agency [online]. Available from: https://forensic-architecture.org//about/agency (Accessed 8 August 2019).
Foucault, M. (1998) ‘Right of Death and Power over Life’, in The history of sexuality: Vol. 1: The will to knowledge. London: Penguin Books. pp. 135–145. [online]. Available from: http://solomon.soth.alexanderstreet.com/cgi-bin/asp/philo/soth/documentidx.pl?sourceid=S10021790.
Grundberg, A. (1999) Crisis of the real: writings on photography. 3rd ed. New York: Aperture.
Hayles, K. (1999) How we became posthuman: virtual bodies in cybernetics, literature, and informatics. Chicago, Ill: University of Chicago Press.
Henning, M. (2018) Image flow: photography on tap. Photographies. 11 (2–3), 133–148.
Horwitz, J. (2018) The billion-dollar, Alibaba-backed AI company that’s quietly watching everyone in China [online]. Available from: https://qz.com/1248493/sensetime-the-billion-dollar-alibaba-backed-ai-company-thats-quietly-watching-everyone-in-china/ (Accessed 12 July 2019).
Keefe, P. R. (2016) The Detectives Who Never Forget a Face. The New Yorker. [online]. Available from: https://www.newyorker.com/magazine/2016/08/22/londons-super-recognizer-police-force (Accessed 6 June 2019).
Kember, S. J. (2008) The virtual life of photography. Photographies. 1 (2), 175–203.
Kitchin, R. (2014) The data revolution: big data, open data, data infrastructures and their consequences. Los Angeles: SAGE.
Knight, W. (2017) The Dark Secret at the Heart of AI [online]. Available from: https://www.technologyreview.com/s/604087/the-dark-secret-at-the-heart-of-ai/ (Accessed 6 August 2019).
Krtilova, K. (2016) Can We Think Computation in Images or Numbers? Critical Remarks on Vilém Flusser’s Philosophy of Digital Technologies. Flusser Studies. 22. [online]. Available from: http://www.flusserstudies.net/sites/www.flusserstudies.net/files/media/attachments/krtilova-can-we-think.pdf (Accessed 13 April 2019).
Learned-Miller, E. G. (2011) Introduction to Computer Vision. Department of Computer Science, University of Massachusetts. p. 11.
Mäkinen, E. (2009) Introduction to Computer Vision from Automatic Face Analysis Viewpoint.
Manovich, L. (1997) ‘Automation of Sight: From Photography to Computer Vision’. University of California at Riverside. p. 22.
Marr, B. (2017) How Much Data Do We Create Every Day? The Mind-Blowing Stats Everyone Should Read. Forbes. [online]. Available from: https://www.forbes.com/sites/bernardmarr/2018/05/21/how-much-data-do-we-create-every-day-the-mind-blowing-stats-everyone-should-read/ (Accessed 30 May 2019).
Mitchell, W. J. T. (2005) What do pictures want? The lives and loves of images. Chicago, Ill.: University of Chicago Press.
Mitchell, W. J. T. (1984) What Is an Image? New Literary History. [Online] 15 (3), 503–537.
Monteleone, D. (2019) Samuel and the Others. Countermapping Contemporary Migration. Unpublished.
Moshakis, A. (2018) Super recognisers: the people who never forget a face. The Observer. 11 November. [online]. Available from: https://www.theguardian.com/uk-news/2018/nov/11/super-recognisers-police-the-people-who-never-forget-a-face (Accessed 9 July 2019).
Mozur, P. (2018) Inside China’s Dystopian Dreams: A.I., Shame and Lots of Cameras. The New York Times. 8 July. [online]. Available from: https://www.nytimes.com/2018/07/08/business/china-surveillance-technology.html (Accessed 1 July 2019).
Mozur, P. (2019) One Month, 500,000 Face Scans: How China Is Using A.I. to Profile a Minority. The New York Times. 14 April. [online]. Available from: https://www.nytimes.com/2019/04/14/technology/china-surveillance-artificial-intelligence-racial-profiling.html (Accessed 1 July 2019).
Paglen, T. (2019) Invisible Images: Your Pictures Are Looking at You. Architectural Design. 89 (1), 22–27.
Raji, I. D. & Buolamwini, J. (2019) ‘Actionable Auditing: Investigating the Impact of Publicly Naming Biased Performance Results of Commercial AI Products’, in Proceedings of the 2019 AAAI/ACM Conference on AI, Ethics, and Society - AIES ’19. [Online]. 2019 Honolulu, HI, USA: ACM Press. pp. 429–435. [online]. Available from: http://dl.acm.org/citation.cfm?doid=3306618.3314244 (Accessed 12 July 2019).
Ramon, M. et al. (2019) Super-recognizers: From the lab to the world and back again. British Journal of Psychology. [Online] 0 (0). [online]. Available from: https://doi.org/10.1111/bjop.12368 (Accessed 9 July 2019).
Ritchin, F. (2009) After photography. New York: Norton.
Rubinstein, D. & Sluis, K. (2008) A LIFE MORE PHOTOGRAPHIC: Mapping the networked image. Photographies. 1 (1), 9–28.
Russell, R. et al. (2009) Super-recognizers: People with extraordinary face recognition ability. Psychonomic Bulletin & Review. 16 (2), 252–257.
Saugmann, R. (2018) ‘Surveillance’, in Visual Global Politics. London: Routledge - Taylor and Francis. pp. 288–293.
Sekula, A. (2016) Photography Against the Grain: Essays and Photo Works 1973-1983. 2nd edition. London: MACK.
Select Committee on Artificial Intelligence (2018) AI in the UK: ready, willing and able. p.183. [online]. Available from: https://publications.parliament.uk/pa/ld201719/ldselect/ldai/100/100.pdf (Accessed 6 August 2019).
Shim, D. (2018) ‘Satellites’, in Visual Global Politics. London: Routledge - Taylor and Francis. pp. 263–271.
Simon, T. (2003) The Innocents. 2nd edition. New York: Umbrage Editions.
Simon, T. (2000) The Innocents [online]. Available from: http://tarynsimon.com/works/innocents/#1 (Accessed 6 July 2019).
Simon, T. (2002) The Innocents - Larry Mayes. [online]. Available from: http://tarynsimon.com/works/innocents/#6 (Accessed 16 August 2019).
Snyder, J. (2004) ‘Res ipsa loquitur’, in Things that talk: object lessons from art and science. New York: Zone Books. pp. 195–221. [online]. Available from: https://www.researchgate.net/publication/41553650_Res_ipsa_loquitur (Accessed 7 July 2019).
Stallabrass, J. (2013) Documentary. London: Whitechapel Gallery; Cambridge, Mass.: MIT Press.
Steinmann, K. (2011) Apparatus, Capture, Trace: Photography and Biopolitics. Fillip [online]. Available from: https://fillip.ca/content/apparatus-capture-trace-photography-and-biopolitics (Accessed 14 July 2019).
Talbot, W. H. F. (1844) ‘Introductory Remarks from Pencil of Nature’, in Picture Industry: A Provisional History of the Technical Image 1844-2018. 1st edition. Arles: SAS LUMA. pp. 36–40.
Virilio, P. (1994) The Vision Machine. Bloomington: Indiana University Press.
Weizman, E. ‘Forensic Architecture: Only the Criminal Can Solve the Crime’, in The least of all possible evils: humanitarian violence from Arendt to Gaza. London: Verso. pp. 98–136.
Wiener, N. (1961) Cybernetics: or control and communication in the animal and the machine. 2nd ed. Cambridge, Mass.: M.I.T. Press.
Wiener, N. (1989) The human use of human beings: cybernetics and society. London: Free Association.
Zins, C. (2007) Conceptual approaches for defining data, information, and knowledge. Journal of the American Society for Information Science and Technology. [Online] 58 (4), 479–493.
Zuboff, S. (2015) Big other: Surveillance Capitalism and the Prospects of an Information Civilization. Journal of Information Technology. [Online] 30 (1), 75–89.