To provide a balanced overview of the spectrum of my work, I have selected and described three representative projects.
Project MIDI Control
Fields: Backend Services, Web Applications, Linux
In order to control several music effects devices simultaneously and conveniently, I built a solid control computer, drawing on the expertise of my father, a passionate amateur musician who plays an acoustic guitar with a sound pickup. I then developed it into a mature product that can be extended at any time in response to new requirements from professional musicians.
The system can be controlled by foot switches as well as by hand.
The impetus for developing a new product was the lack of a control system on the market that offers a clear user interface and the following key features:
- different MIDI command types (e.g. program change and bypass change) in one preset
- central configuration of the individual MIDI data per effects unit, accessed uniformly by all control presets (see the sketch below)
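To illustrate the first feature, here is a minimal Python sketch using the mido library. The port name, program numbers, and the controller number used for bypass are hypothetical stand-ins, not the product's actual values.

```python
# Hypothetical preset combining two MIDI command types per effects unit:
# a program change plus a bypass toggle sent as a control change.
import mido

preset = [
    {"device": "Delay", "messages": [
        mido.Message("program_change", channel=0, program=5),
        # bypass is commonly mapped to a control change (CC number varies per device)
        mido.Message("control_change", channel=0, control=102, value=0),
    ]},
    {"device": "Reverb", "messages": [
        mido.Message("program_change", channel=1, program=12),
    ]},
]

with mido.open_output("MIDISPORT 2x2 Port 1") as port:  # hypothetical port name
    for unit in preset:
        for message in unit["messages"]:
            port.send(message)
```

Combining both message types in one preset means that a single foot-switch press can reconfigure several effects units at once.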
Technical Components:
- Input Devices: Tablet PC (e.g. iPad), MIDI Foot Switch
- Single-Board Computer: central control device (e.g. Raspberry Pi)
- WiFi Hotspot for the Tablet PC
- USB Connection for the MIDI Interface
- USB⇄MIDI Interface: e.g. MIDISPORT 2x2
- Effects Units
The guitar image stems from Vectorportal.com, CC BY.
Further Details:
- Frontend on the iPad to work with the finest touch-screen hardware
  - top user experience
  - any other tablet or even a smartphone can be used for operation
- MIDI standard-compliant communication allows control of any effects devices
  - input: control devices
  - output: effects units
- Various effects scenarios can be prepared in the form of control presets through the user interface on the tablet PC.
- User settings are stored in simple JSON files (see the sketch after this list)
  - human-readable
  - quick backup/restore
  - easy versioning
- Platform-independent backend thanks to the Java Virtual Machine
  - besides the Linux-based control computer for which the system was developed, Windows and macOS computers can also be used for operation
- in continuous, trouble-free operation since 2018
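As a minimal sketch of how such JSON-based settings storage can work (file name and preset structure are hypothetical, not the product's actual schema):

```python
# Store and load user presets as human-readable, diff-friendly JSON.
import json
from pathlib import Path

SETTINGS_FILE = Path("presets.json")  # hypothetical file name

def load_presets() -> dict:
    return json.loads(SETTINGS_FILE.read_text()) if SETTINGS_FILE.exists() else {}

def save_presets(presets: dict) -> None:
    # indent=2 keeps the file readable and makes diffs small for easy versioning
    SETTINGS_FILE.write_text(json.dumps(presets, indent=2, ensure_ascii=False))

presets = load_presets()
presets["Clean Solo"] = {"device": "Delay", "program": 5, "bypass": False}
save_presets(presets)
```

Because the file is plain text with stable formatting, it can be backed up by simple copying and versioned with tools like Git.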
Rock-Solid Product
Originally written in AngularJS, the source code is still well maintained, maintainable, and actively developed today, just as it was on day one. It thus meets the important requirement of longevity and has already survived countless Angular versions without a complex migration from AngularJS to the newer Angular ever becoming necessary. No precious time had to be wasted on Google's capriciousness. Upgrade-resistant code is silver 🥈 (and absolutely necessary), but beyond that, long-lasting code is, in my opinion, gold 🥇: it not only meets the requirement of technical flexibility for regular version upgrades of the libraries in use, but has also remained understandable over the years.
For even more reliability, the backend code for the Java Virtual Machine was written in the straightforward Scala programming language, whose clever rules support the program's authors and make unintentional malfunctions less likely.
Many Exciting Experiences and Insights
The diversity of the work on the project ranges from processing hardware-related MIDI messages in the backend code to psychological aspects of the product's usability.
Highly Resilient Geographic Map App
Fields: Cloud Computing, Online Maps
Goal: Rounding out my knowledge of Kubernetes
→ Project on GitHub
In this project, I was able to build on my extensive professional experience with Kubernetes, which centers on application deployments and their stable operation in clusters in the cloud. The personal benefit for me therefore lay mainly in filling the remaining knowledge gaps.
In preparation for planned map applications with a selection of possible content, I proposed a preliminary project on this topic as part of the Cloud Computing course. The goal was to analyze a quickly deployable, reliable, scalable, and visualizable cloud operation in a collaborative project with two fellow students. Our project proposal can be found → here. Thanks to the dedicated collaboration, we had a ready-to-use result within a few days and were able to conclude the project with a presentation featuring live elements.
Stable Operation with Kubernetes
The open-source product Kubernetes is a widely popular and highly maintainable container orchestration system for backend services. It is well documented and, thanks to YAML, familiar to configure, and it builds directly on top of the IaaS offerings of various cloud providers. Kubernetes enables a largely platform-independent setup because it starts below the PaaS layer and builds a standardized system on top.
In the case of a map application, the backend services are:
- Database
  - PostgreSQL with PostGIS
- Tile Server
- Frontend Server
  - Leaflet (JavaScript)
All three services are represented as workloads on three nodes, each with one instance (pod). A node is a virtual or physical machine. Up to two of the three nodes can fail before operations collapse.
Such a scenario should nevertheless remain very rare: the nodes must be operated so independently of each other that simultaneous failures become extremely unlikely. A good solution for this is to choose multiple availability zones, each containing one node.
The following image shows the distribution of a backend service across the nodes: each node runs an instance of the service as a pod. Externally, the instances are visible as a single service with one IP address. HTTP requests to the service are forwarded to one of the three pods by a load balancer, and every pod processes requests using the same machine code. In our case, the load balancer is the Google Cloud HTTP(S) Load Balancer.
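As an illustrative sketch of this setup (service name, container image, and port are assumptions, not our actual configuration), the following Python snippet generates a Deployment whose three pods are spread across availability zones, plus the externally visible LoadBalancer Service:

```python
# Sketch: generate Kubernetes manifests for a 3-pod deployment and its service.
import yaml  # PyYAML

deployment = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {"name": "tile-server"},  # hypothetical name
    "spec": {
        "replicas": 3,  # one pod per node
        "selector": {"matchLabels": {"app": "tile-server"}},
        "template": {
            "metadata": {"labels": {"app": "tile-server"}},
            "spec": {
                # spread the three pods across availability zones
                "topologySpreadConstraints": [{
                    "maxSkew": 1,
                    "topologyKey": "topology.kubernetes.io/zone",
                    "whenUnsatisfiable": "DoNotSchedule",
                    "labelSelector": {"matchLabels": {"app": "tile-server"}},
                }],
                "containers": [{
                    "name": "tile-server",
                    "image": "example/tile-server:latest",  # hypothetical image
                    "ports": [{"containerPort": 80}],
                }],
            },
        },
    },
}

service = {
    "apiVersion": "v1",
    "kind": "Service",
    "metadata": {"name": "tile-server"},
    "spec": {
        "type": "LoadBalancer",  # one external IP in front of the three pods
        "selector": {"app": "tile-server"},
        "ports": [{"port": 80, "targetPort": 80}],
    },
}

print(yaml.safe_dump_all([deployment, service], sort_keys=False))
```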
Two “health aspects” of the operation were candidates for possible visualizations:
- Data Flows between services
  - HTTP requests between services
  - data throughput between services
- State of Security Restrictions of services
The data flows between the services are clearly visualized by the chart provided by Kiali; individual HTTP requests cannot be inspected, only their statistical properties such as the number of requests per second or their response times.
The request and response contents needed for troubleshooting are not available in the visualization, but could, for example, be taken from the operational logs.
The following image shows an active response-time filter: the setting rt > 10 ("response time greater than 10") highlights all data connections with a response time of more than 10 milliseconds with a yellow background. This makes it easy to locate bottlenecks affecting data traffic.
Using network policies, we defined security restrictions that serve as another important line of defense against attackers, in addition to the usual network separation, encryption, and authentication.
Unfortunately, we were unable to find any visualization solution for network policies. This open issue offers potential for new software that makes the effective policies visible in a clear format, so that vulnerabilities caused by misconfigurations can be identified quickly.
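For concreteness, here is a minimal sketch of such a network policy, generated as YAML; the pod labels and the rule are hypothetical, not our actual configuration. It would allow only the tile server to open connections to the database:

```python
# Sketch: a NetworkPolicy admitting only the tile server as a database client.
import yaml  # PyYAML

network_policy = {
    "apiVersion": "networking.k8s.io/v1",
    "kind": "NetworkPolicy",
    "metadata": {"name": "db-ingress-only-from-tile-server"},  # hypothetical name
    "spec": {
        "podSelector": {"matchLabels": {"app": "database"}},
        "policyTypes": ["Ingress"],
        "ingress": [{
            "from": [{"podSelector": {"matchLabels": {"app": "tile-server"}}}],
            "ports": [{"protocol": "TCP", "port": 5432}],  # PostgreSQL port
        }],
    },
}

print(yaml.safe_dump(network_policy, sort_keys=False))
```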
Further Experiences
A list of further experiences can be found here: Lessons Learned.
Pervasive Computing: Audio and Motion Data Analysis
Fields: Smartphone Sensors, Decision Trees, Regression Analysis, Neural Networks, Data Science
As part of the Pervasive Computing courses at the University of Linz, I was able to explore the interaction between smartphone hardware and various classification algorithms in detail. I worked with the following sensors:
- Audio Sensor: Microphone
  - recording of vehicle sounds
- Video Sensor: Camera
  - vehicle recording, used to manually label the simultaneously recorded vehicle sounds according to predefined vehicle classes
- Acceleration Sensors
  - recording of movement data while walking or using a screwdriver
  - acceleration values in three orthogonal directions of three-dimensional space
Vehicle Categories and Their Sounds
I was able to use the smartphone's camera app to record the audio and video data. I used WaveSurfer to label the vehicles in the video. The main area of the window shows the audio frequency spectrum over time; the line titled ".lab" shows the labels (example: "Hr" = heavy vehicle coming from right):
In principle, this is a very useful tool for manual labeling. However, it is not intuitive to use and is hardly suitable for use cases where the video track is needed as an additional source of information: the video has to be opened in a separate program window and synchronized manually. It would therefore be worthwhile to develop a modern, more versatile replacement for this tool.
A particularly exciting part of the sound analysis was the encoding as Mel-frequency cepstral coefficients (MFCCs), which are also used in speech recognition tasks. I used them to algorithmically classify vehicles based on the voices of their engines and other driving noises. The Python library librosa served me well in this process.
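A minimal sketch of this encoding step with librosa; the file name is hypothetical, and 13 coefficients is a common choice rather than necessarily the one I used:

```python
# Encode a vehicle recording as Mel-frequency cepstral coefficients (MFCCs).
import librosa
import numpy as np

y, sr = librosa.load("vehicle_recording.wav", sr=None)  # keep the original sample rate
mfccs = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)     # shape: (13, number of frames)

# Averaging over time yields one fixed-length feature vector per recording.
feature_vector = mfccs.mean(axis=1)
print(feature_vector.shape)  # (13,)
```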
Vehicle Classification: Results
- Classification by vehicle size was unsuccessful, possibly due to the difficult decisions during manual labeling as to whether a vehicle was light, medium, or heavy: 43% of vehicles were misclassified. The multi-layer perceptron classified most vehicles as medium and none as heavy.
- The machine classification of vehicles coming from the right or left was significantly more successful: 93% were classified correctly. Of course, there is still room for improvement to exploit in future projects: a total of only 132 vehicles were used to train the algorithms (see the sketch below).
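For illustration, a scikit-learn version of such a direction classifier could look like this; the feature matrix and labels are random stand-ins for the 132 labeled vehicles, not my recorded data:

```python
# Hypothetical direction classifier: MFCC feature vectors -> left/right label.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(size=(132, 13))    # stand-in: one 13-dim MFCC vector per vehicle
y = rng.integers(0, 2, size=132)  # stand-in: 0 = from left, 1 = from right

model = make_pipeline(StandardScaler(), MLPClassifier(max_iter=1000, random_state=0))
scores = cross_val_score(model, X, y, cv=5)  # cross-validation guards against overfitting
print(f"mean accuracy: {scores.mean():.0%}")
```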
Acceleration Data
To record acceleration data, I used the Phyphox app, which was completely unknown to me before and makes working with a smartphone's acceleration data as easy as possible. The raw data of a walk recorded by Phyphox takes the following form:
Applying a low-pass filter results in the following appearance, with repetitions that are significantly more similar. This indicates that the low-pass filter successfully removes higher-frequency noise without destroying relevant information:
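The filtering step itself can be sketched as follows; sampling rate, filter order, and cutoff frequency are assumptions, not the values I actually used:

```python
# Zero-phase low-pass filtering of an acceleration signal with SciPy.
import numpy as np
from scipy.signal import butter, filtfilt

fs = 100.0    # assumed sampling rate in Hz
cutoff = 3.0  # assumed cutoff in Hz; walking is a low-frequency movement

b, a = butter(4, cutoff, btype="low", fs=fs)

t = np.arange(0, 10, 1 / fs)  # synthetic stand-in for a recorded walk
raw = np.sin(2 * np.pi * 1.5 * t) + 0.5 * np.random.default_rng(0).normal(size=t.size)
smoothed = filtfilt(b, a, raw)  # filtfilt avoids the time shift of one-way filtering
```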
This is also reflected in the following correlation comparison, which examines the correlation between the x-axis movement of the phone carried in the hand and the x-axis movement of the phone carried in the trouser pocket.
Regression methods used: linear regression, Gaussian processes, multi-layer perceptron
- Unfiltered, the correlation is less than 0.19.
- When low-pass filtered, however, the correlation is at least 0.38, i.e. twice as high (see the sketch below).
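A minimal sketch of this comparison with NumPy and SciPy; the two signals are synthetic stand-ins that share a common walking rhythm, not my recordings:

```python
# Compare correlation of two x-axis signals before and after low-pass filtering.
import numpy as np
from scipy.signal import butter, filtfilt

def lowpass(signal, fs=100.0, cutoff=3.0):
    b, a = butter(4, cutoff, btype="low", fs=fs)
    return filtfilt(b, a, signal)

rng = np.random.default_rng(1)
t = np.arange(0, 10, 0.01)
walk = np.sin(2 * np.pi * 1.5 * t)                    # shared walking rhythm
hand_x = walk + rng.normal(scale=0.8, size=t.size)    # stand-in: phone in hand
pocket_x = walk + rng.normal(scale=0.8, size=t.size)  # stand-in: phone in pocket

raw = np.corrcoef(hand_x, pocket_x)[0, 1]
filtered = np.corrcoef(lowpass(hand_x), lowpass(pocket_x))[0, 1]
print(f"raw: {raw:.2f}, low-pass filtered: {filtered:.2f}")
```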
Any movement period in three-dimensional space can also be represented as a frequency spectrum. The spectral representation offers two advantages: first, for machine processing, feature vectors of any length can easily be generated from the individual frequencies (the vector length is determined by combining more or fewer adjacent frequencies; a sketch of this extraction follows after the observations below); second, the spectra of different movement sequences can be compared at first glance, unlike their temporal representations. However, for classifying movements based on their frequency spectra, it is also important that the spectra retain their character over time, i.e., always look the same. So I conducted the following experiment and got a good impression:
The lines in the video above connect accumulated frequencies that are close to each other. Viewed from top to bottom, the lines represent the frequency spectra of the following movement sequences:
- BLUE smartphone in right pocket
- ORANGE smartphone in left pocket
- GREEN smartphone in right hand
- RED smartphone in left hand
The video representation brings time into the chart as a third dimension and allows an assessment of the data within seconds, even before the first calculations are made.
In contrast to perspective 3D representation, this video does not run the risk of causing misjudgments due to optical illusions.
Four properties of the movement data are very clearly evident:
- When carried in a trouser pocket, a mobile phone produces stronger magnitudes in the higher frequency range than when carried by a hand.
- Although the left and right trouser pockets have a similar spectrum, they are still easily distinguishable.
- Left and right hands produce a very similar spectrum and are therefore less distinguishable.
- The spectral patterns of the acceleration data are sufficiently stable over time to remain distinguishable.
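As mentioned above, the spectra can be folded into feature vectors of any desired length by combining adjacent frequencies. A minimal sketch of that extraction step (window length and bin count are assumptions):

```python
# Turn one window of acceleration samples into a fixed-length spectral feature vector.
import numpy as np

def spectral_features(window: np.ndarray, n_bins: int = 16) -> np.ndarray:
    spectrum = np.abs(np.fft.rfft(window))    # magnitude spectrum of the window
    bins = np.array_split(spectrum, n_bins)   # combine adjacent frequencies
    return np.array([b.sum() for b in bins])  # accumulated magnitude per bin

window = np.random.default_rng(2).normal(size=256)  # stand-in for one recorded window
print(spectral_features(window).shape)  # (16,)
```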
The analysis results of the WEKA tool confirm my observations: after training, a multi-layer perceptron is able to flawlessly distinguish all movement data originating from the phone in the right pocket from that originating from the phone in the left pocket. Furthermore, there are no mismatches between pocket and hand. Distinguishing between left and right hands is difficult not only for me, but also for the multi-layer perceptron: there are two mismatches between the right hand and the left hand.
Confusion Matrix output by the WEKA tool at the end of the multi-layer perceptron analysis:
In general, WEKA proved to be a comprehensive and reliable tool for the final analyses, after I had pre-examined the data in charts in JupyterLab using Python code and pre-processed it with the help of Python libraries.
Lab Reports
The topics touched upon above are described in detail in my four lab reports. I thoroughly enjoyed working on them and found them very motivating; a small stimulus is enough to inspire me to keep researching.
- Lab Report 1: Vehicle Categorization
- Lab Report 2: Motion Classification
- Lab Report 3: Screwing Process Recognition
- Lab Report 4: Regression Analysis
Further Topics
Photo Management, Photo Book Creation and Photo Long-Term Storage
In parallel to my other projects, I'm working on ideas and prototypes for photo and video management in my role as an amateur photographer.