I recently developed functionality that allows an app's users to sign in using a business card. The card has an embedded batteryless, short-range communication device known as a Near Field Communication (NFC) tag. The user simply holds the business card close to their mobile device (usually the top rear) and data is transferred from the static, wireless tag to the device. Power is transferred from the mobile device to the NFC tag when the two are in close proximity, so the tag containing the data does not require its own power supply. I have received a number of messages asking me to document the process.
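The app itself reads the tag through a Flutter NFC plugin, but the idea can be sketched in a few lines. NFC business-card tags typically carry an NDEF URI record, whose payload starts with a one-byte prefix code defined by the NFC Forum URI Record Type Definition, followed by the rest of the URI. The sign-in URL and token below are hypothetical, purely for illustration:

```python
# Prefix codes from the NFC Forum URI Record Type Definition (abridged).
URI_PREFIXES = {
    0x00: "",
    0x01: "http://www.",
    0x02: "https://www.",
    0x03: "http://",
    0x04: "https://",
}

def decode_uri_record(payload: bytes) -> str:
    """Return the full URI encoded in an NDEF URI record payload."""
    prefix = URI_PREFIXES.get(payload[0], "")
    return prefix + payload[1:].decode("utf-8")

# A tag might carry a sign-in link with an embedded token (hypothetical values):
payload = bytes([0x04]) + b"example.com/signin?token=abc123"
print(decode_uri_record(payload))  # https://example.com/signin?token=abc123
```

Once the app has the URI, the embedded token can be exchanged for a session in the usual way; the NFC part is only the transport.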
The app combines artificial intelligence and gamification to acquire user-submitted source images that enhance the underlying machine learning object recognition dataset.
The app prototype uses a fairly limited model with just under a hundred recognisable classes of object, ranging from a computer mouse to a fridge, a person or even a toilet!
To help improve the number of recognisable objects, the app asks the user to share photographs of objects it could not guess during the game.
I Spy – the original version
The app is based on the popular game I Spy, in which players take turns guessing the identity of a nearby object. The challenger provides the first letter of the object's name as a clue to help the other players. If nobody guesses correctly, the challenger wins the round.
AISPY – A ghost in the machine
At the heart of the app’s functionality is the ability to ‘recognise’ objects in the real world and remember them throughout the game. Although the classic version of I Spy is a children’s game, this app is aimed at grown-ups.
It uses a machine learning platform called TensorFlow – “a comprehensive, flexible ecosystem of tools, libraries and community resources that lets researchers push the state-of-the-art in ML and developers easily build and deploy ML powered applications.”
AISPY uses the device’s camera to stream images, capturing frames as the user moves the device around the environment. Each frame is then processed, and the app stores in memory a list of the objects it has recognised, along with their types and their positions within the frame as bounding boxes.
From this data, the app is able to challenge the human player to guess a spied object. When it is the human player’s turn to challenge, the app draws upon its recent inventory of recognised objects from local storage to guess the human player’s object, based on the first letter they provided as a clue.
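The inventory-and-clue mechanic can be sketched as follows. This is an illustrative Python sketch, not the app's actual Flutter code; the labels, scores and box layout are assumptions:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Detection:
    label: str    # recognised object type, e.g. "toilet"
    score: float  # model confidence, 0..1
    box: tuple    # (left, top, right, bottom) in frame pixels

# In-memory inventory built up as frames are processed (dummy values).
inventory = [
    Detection("mouse", 0.72, (10, 40, 90, 110)),
    Detection("fridge", 0.64, (200, 0, 420, 480)),
    Detection("toilet", 0.58, (30, 300, 160, 470)),
]

def guess(first_letter: str, seen: list) -> Optional[str]:
    """Return the highest-confidence recognised object whose name starts with the clue letter."""
    candidates = [d for d in seen if d.label.startswith(first_letter.lower())]
    return max(candidates, key=lambda d: d.score).label if candidates else None

print(guess("t", inventory))  # toilet
```

In the real app the same inventory serves both directions of play: the AI challenges with objects it has seen, and answers the human's clue from the same list.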
The interaction between the AI player (the device!) and the human player is carried out almost entirely by voice: the app listens using speech recognition and replies using the device’s built-in text-to-speech capabilities.
Computer vision provides an exciting tool for creative app developers. However, Artificial Intelligence is still a way off achieving our levels of visual perception.
“It is amazing that humans and animals do this so effortlessly, while computer vision algorithms are so error prone” – Richard Szeliski (Microsoft Research)
Machine learning mishaps in the real world would be cause for alarm. But within AISPY, they provide an element of lighthearted humour, especially when the app mistakes an object for something utterly absurd!
I have played upon the humorous aspect further by giving the AI player quirky, charismatic responses and utterances throughout the game.
Microsoft COCO Dataset
The current version of AISpy uses a model pre-trained on the Microsoft COCO (Common Objects in Context) dataset. The process of building upon a pre-trained model in this way is called transfer learning.
The COCO dataset includes 80 object types derived from over 200,000 labelled images.
The associated research paper describes COCO as a dataset created with the goal of advancing the state of the art in object recognition by placing it in the context of the broader question of scene understanding.
AISPY seeks to build upon the COCO dataset’s 80 or so classified object types by asking users to submit a photograph of an object from a game when the AI player fails to guess it correctly. The app saves two images: a contextual image of the object in situ and a clipped version. The bounding boxes used to identify the object within the image will then contribute to the overall dataset and eventually allow new object types to be added to the app’s trained model.
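The two saved images can be sketched like this: the full frame is kept as-is for context, and the clipped copy is cut out using the detection's bounding box. The tiny dummy frame and the (left, top, right, bottom) box layout are illustrative assumptions:

```python
def clip(frame, box):
    """Return the sub-image of `frame` (a list of pixel rows) inside `box`."""
    left, top, right, bottom = box
    return [row[left:right] for row in frame[top:bottom]]

# Dummy 6x4 frame whose "pixels" are just their (x, y) coordinates.
frame = [[(x, y) for x in range(6)] for y in range(4)]

context_image = frame                       # saved unchanged, object in situ
clipped_image = clip(frame, (1, 1, 4, 3))   # 3-wide, 2-high crop around the object
print(len(clipped_image), len(clipped_image[0]))  # 2 3
```

Keeping both versions means the dataset retains the scene context COCO emphasises, while the clip gives a tight example of the object itself.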
The object recognition functionality is limited by the small number of available pre-trained datasets, so I have purposely lowered the accuracy threshold to ensure the game identifies more objects. I have sought to play upon the humour of mistaken identifications alongside the playful nature of the AI character’s vocal feedback during gameplay.
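The effect of lowering the threshold is easy to see in a sketch. The labels and confidence scores below are made up for illustration; the trade-off is that a looser cut-off surfaces more objects, including the occasional absurd misidentification:

```python
# Dummy detections as (label, confidence) pairs.
detections = [("chair", 0.91), ("dog", 0.34), ("toothbrush", 0.28)]

def visible(dets, threshold):
    """Return the labels whose confidence meets the threshold."""
    return [label for label, score in dets if score >= threshold]

print(visible(detections, 0.5))   # strict threshold: ['chair']
print(visible(detections, 0.25))  # lowered threshold: ['chair', 'dog', 'toothbrush']
```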
Overall, it has been an interesting experience working with TensorFlow and Flutter. I am hoping to expand the trained model with images collected through this app; it will then be useful for future projects that require real-time object recognition with a greater number of recognisable objects.
The second annual international Flutter hackathon, organised by the global Flutter Community, took place over the weekend of 27/28 June 2020.
The 48-hour app jam invites teams of up to five members to build an app using Flutter - Google’s UI toolkit for building natively compiled applications for mobile, web and desktop from a single codebase.
The purpose of this article is to explore how elements of UI (user interface) design evolved from the analogue world of physical, three dimensional switches, dials and buttons. It also examines potential current and future roles for skeuomorphs (digital components which look like real-world objects) within the field of software and product design.
I am reviewing the first chapter of Douglas Brown's thesis on the suspension of disbelief in video games. I chose this chapter for analysis because I wanted to understand the foundations upon which the thesis is based - the case for games as a distinct medium. One of my own research interests is the nature of VR (virtual reality) and I am interested in the implications of Brown's conclusion within the field of VR creativity.
Within this post I will provide a step by step guide to create a Google Cloud Firestore driven contact form for a Flutter web app; including details about how to build the form, validate the input and set up security rules before finally saving the data to the Google Firestore database.
I recently decided to test drive Google’s ARCore Extension for the Unity game engine to find out how Cloud Anchors work. Cloud Anchors allow AR (Augmented Reality) apps to create virtual objects within a physical space and persist them across multiple user sessions. This means that different users should be able to see and interact with the same virtual objects at the same time from an AR app.
John is a French baker. He has been baking bread in his village bakery alongside his wife for over 30 years. John gets up every morning at 4am to make sure there is plenty of fresh bread on sale when the bakery opens its doors at 7am, at which time John's wife, Matilda, joins him in the shop to serve the customers. The bakery is very busy every day. John continues to bake bread until early afternoon, then returns home for some rest and goes to bed at 8pm. Matilda remains at the shop until it closes at 7pm. There must never be any leftover bread at the end of the day, but at the same time they must never run out either, or else they'll lose customers.
Last summer, I was working on a prototype for an AR app that would allow users to create virtual objects in real-world spaces. I carried out the work using the Unity3D game engine. My last commit to the repository was five months ago.
Since those last commits I've been busy working on a few Flutter projects (see previous posts). I now have a new opportunity to return to the AR project, so I have been reflecting upon how I might combine these two technologies.