The time I wanted to play Mike Tyson’s Punch Out on my MacBook.

For a long time I’ve known about arcade & console emulation, but I have always assumed the setup was a nightmare that wasn’t worth the payoff. I’ve had friends who did the whole Raspberry Pi thing, and it seemed like it was more of a complaint generator than an NES emulator. But recently I’ve had friends on Facebook who I didn’t necessarily think were all that technically inclined talk about their bartop arcade setups with Raspberry Pi. On the heels of that, I walked into Micro Center in Dallas around Christmastime and saw a display where they had a whole arcade box kit running off a Raspberry Pi 3B+. I decided it was time to tackle that project, if only on a smaller scale.

But that’s not what I’m going to tell you about today. Instead, I’m going to tell you how all this emulation talk had my NES weenie vibrating pretty good, how I solved my burning desire to play Punch Out on my laptop yesterday, and how easy it was to get it all up and running on my MacBook Pro. It took about a minute. It actually took longer to find batteries for my Wiimote.

Step 1: Get Open Emu

Having used RetroPie on my Pi “build,” I googled “retropie for mac,” which is of course a ridiculous query, but it led me directly to something called OpenEmu, an open-source arcade & console emulation interface for macOS. Hit the download button. It supports tons of systems such as NES, SNES, and just about any other console you can think of, as well as actual arcade games.

Step 2: Unzip that sucker

I made a folder called openEmu and copied the zip I downloaded in step 1 (the actual filename was OpenEmu_2.0.8.zip) into that folder. Then I unzipped it. Truly magic.

Step 3: Run that sucker

From Finder, double-click OpenEmu.app. You may have to clear a security hurdle the first time you launch it (macOS likes to warn you about apps downloaded from the internet). You’ll see the app open up with a navigation bar down the side, and a box inviting you to drag & drop game files. Ah crap, you need game files too?? Here we go with the add-ons…

Step 4: Find you some roms

Roms are essentially everything that was on that game cartridge, only turned into a small file. Some poor programmer’s life’s work, compressed into 100kb. Or from another point of view, hours of childhood entertainment provided by just 100kb of code. Well that’s depressing, isn’t it?

At any rate, they usually have a .zip or .rom extension. They are quite easy to find for just about any system you can imagine. My goal was to play Punch Out, so I had to find NES roms online. It is right here where I should probably put some disclaimer about how it is illegal to download a rom of a game for which you don’t already own the cartridge, but I’m not a lawyer and I didn’t even stay at a Holiday Inn Express last night. You should probably figure out what the long arm of the law has to say about all this. But since I bought an NES from FuncoLand in like 1995 and have a box of games (including Punch Out) in my attic that has survived 8 or 9 moves without ever having been opened, I think I’m in the clear.

I’ll assume your lawyer has cleared you to proceed, and that you have also found the roms you were looking for. If your rom is zipped, you’ll need to first unzip it, and then you can simply drag it into OpenEmu for the system you want to play. So navigate to NES, and then drag your unzipped rom from Finder onto the OpenEmu drag/drop box. You should see something like this….obviously my thirst was not quenched by Punch Out alone.

Step 5: Attempt to Play Punch Out

Just double-click the game, and it will fire up the NES emulator and load up your game, without the need for blowing into the cartridge 5 times, then taking it out and re-inserting it just far enough that it scrapes as you push it down, or whatever other sort of voodoo you used to make your games load after another wasted summer of your youth. The game should load just like it would have if you had an actual NES.

Step 6: Pair your wiimote

Moving the mouse inside the game window will bring up a small menu at the bottom of the screen for things like power on/off, reset, save state (WAT!?), and, as luck would have it, controller configuration. That particular option is under the gear icon dropdown, which I can’t seem to get a screen capture of. When you find it, click Edit Game Controls… which will bring up this fancy screen:

In the lower right, you’ll see a section labeled Input with a dropdown. It probably has Keyboard already selected, but don’t be discouraged. Click on it, and find the option for Add a Wiimote… which will bring up a dialog box that takes a more no-frills approach:

Now there is probably an officially correct way to do this, but based on what I read on a couple of other posts, I pressed (and let go of) the red button on the back of the Wiimote, then I held (and did not let go of) buttons 1 & 2, and clicked the Start Scanning button. In just a few seconds the Wiimote started doing the sorts of things that drive middle-aged housewives crazy (read: flashing and vibrating), and just like that my Wiimote was synced.

From that same controller screen you can define your controller key mappings, so the emulator knows which way is up and which button is A, etc. Once you’ve got that bad boy mapped, close the window (red dot, upper left) and you will return to your game.

Step 7: Actually Play Punch-Out

You are now free to re-live your childhood.

Writing Avro to HDFS with Apache NiFi

This seems like the sort of thing somebody would want to do. I mean, I wanted to do it. I had some JSON messages and wanted to write them to HDFS as Avro so I could query them in Hive. The thing is, it’s not exactly obvious how one might do such a thing in nifi. Or at least it wasn’t obvious to me, so this guide will attempt to demystify all that. It won’t demystify Avro itself; that is left as an exercise for the reader. FULL DISCLOSURE: This is my first blog post, ever.

We’ll start by spinning up a Docker container for nifi:

docker run -it --name nifi -p 8080:8080 apache/nifi

Then in your browser, go to localhost:8080/nifi and the nifi interface should come up. It may take a minute. Your patience will be rewarded.

The first thing we’ll need is some JSON data to turn into Avro. The canonical example of this is to consume the Twitter sample endpoint, but (a) that requires getting a developer account and (b) that payload is way bigger than I want to hassle with. So instead let’s drop a GenerateFlowFile processor on the canvas and use it to generate some generic data. Open up the configuration and set the custom text to some sort of valid JSON:

{"foo": "sample data", "bar": "more sample data"}
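Whatever you paste into the custom text has to be valid JSON, or things will fall apart downstream at the conversion step. A quick sanity check outside of nifi (a hypothetical Python snippet, not part of the flow) might look like:

```python
import json

# The same custom text we gave the GenerateFlowFile processor.
custom_text = '{"foo": "sample data", "bar": "more sample data"}'

# json.loads raises an exception if the text is not valid JSON,
# which is exactly the kind of failure nifi would surface later.
record = json.loads(custom_text)
print(sorted(record))  # ['bar', 'foo']
```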

UpdateAttribute

Before we can convert the JSON to Avro we need to inform nifi of the Avro schema, which we will do with an UpdateAttribute step. The Avro schema will be carried along with the FlowFiles as an attribute for use downstream. In order to add the schema as an attribute, we will add a custom property through the UpdateAttribute processor configuration. Click the [+] to bring up the Add Property modal.

The property name we’re creating will be called avro.schema and the value will be the Avro schema for your JSON. Depending on the complexity of your data, you may be able to use an online JSON-Avro Schema Generator. It is possible for nifi to infer the Avro schema, but it will be blissfully unaware of things like required fields or any other complexities in your schema.

{
  "name": "MyClass",
  "type": "record",
  "namespace": "com.acme.avro",
  "fields": [
    {
      "name": "foo",
      "type": "string"
    },
    {
      "name": "bar",
      "type": "string"
    }
  ]
}

Once you have the Avro schema in hand, add the property, and paste your schema into the value field for that new property.
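As a rough sketch of what one of those online generators (or nifi’s own inference) does with flat JSON like ours, here is a naive, hypothetical version in Python. The name and namespace defaults match the schema above; the type mapping is deliberately simplified (no nesting, no nullable unions, no required-field logic, which is exactly the nuance inference misses):

```python
import json

def infer_avro_schema(json_text, name="MyClass", namespace="com.acme.avro"):
    """Naively map a flat JSON object to an Avro record schema.

    Only handles string/boolean/int/float leaves; real generators
    cover nesting, unions, and nullability.
    """
    type_map = {str: "string", bool: "boolean", int: "long", float: "double"}
    record = json.loads(json_text)
    fields = [{"name": k, "type": type_map[type(v)]} for k, v in record.items()]
    return {"name": name, "type": "record", "namespace": namespace, "fields": fields}

schema = infer_avro_schema('{"foo": "sample data", "bar": "more sample data"}')
print(json.dumps(schema, indent=2))
```

Note the exact-type lookup: `type(True)` is `bool`, not `int`, so booleans map to "boolean" even though bool subclasses int in Python.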

ConvertRecord

Now we can actually perform the conversion to Avro, using a ConvertRecord step. Nifi handles conversions like this with a Record Reader & Record Writer, each of which is implemented as a controller service. For our purposes, we’ll need to create a new reader service using JsonTreeReader 1.10.0. When prompted, give it a descriptive name, because nifi doesn’t do a particularly good job of letting you organize these services once you have a lot of them.

Once you create the controller service, you’ll need to configure it by clicking the arrow next to the Record Reader.

This will take you to the Controller Services page, where you can configure the service you just created by clicking on the gear icon.

Set the Schema Access Strategy to Use ‘Schema Text’ Property. Note that the Schema Text property has a value of ${avro.schema}. This means that the reader service will get the schema text from a FlowFile attribute named avro.schema. Hey, that’s what we named the property in our UpdateAttribute step! What an amazing coincidence. Click apply, and then click the lightning bolt to enable the service.

Next we’ll have to create and configure a Record Writer controller service. The steps are the same as the reader service, except that we want AvroRecordSetWriter 1.10.0, configured as such:

At this point if you run your flow, the output from the ConvertRecord step should look something like this, which is the Avro transformed message. You can see that the Avro schema is embedded in the message, which is because that’s what we told nifi to do in the record writer service.

Objavro.schemaŽ{"type":"record","name":"MyClass","namespace":"com.acme.avro","fields":[{"name":"foo","type":"string"},{"name":"bar","type":"string"}]}avro.codecnull�ÉŠKRïc
/B§k‹(Æ“:sample data more sample dataÉŠKRïc
/B§k‹(Æ“

PutHDFS

Now that we have our message in Avro format, let’s write it to HDFS with a PutHDFS step. You’ll need the hdfs-site.xml and core-site.xml files off your Hadoop cluster copied to your running Docker container, and you’ll need a folder created in HDFS to write the files to. The PutHDFS config should look something like this:

Query Avro data in Hive

Start all the processors and we will soon have data written to HDFS in Avro format, which you can see in the Hue file browser or with an hdfs dfs -ls from the command line. So what? You probably want to query that data, which is most easily done by creating a Hive external table. The Hive external table needs 3 things in order to work:

  1. Serializer/de-serializer (aka SerDe)
  2. Location of the Avro data
  3. Location of the Avro schema

The AvroSerDe comes installed with Cloudera CDH/CDP. The Avro data is in the location specified in your PutHDFS step in nifi. The location of the Avro schema is the one thing we’re missing. Perhaps the easiest way to solve this is to create a new file via the Hue file browser one directory level above your Avro data. The exact location doesn’t really matter, but you don’t want it co-located with your Avro data files, because Hive will try to read every file under the table’s location as data. Create the file, call it example_avro.avsc, and then paste the Avro schema text into the file.

CREATE EXTERNAL TABLE example_avro_hive_ext
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
STORED AS AVRO
LOCATION '/user/nifi/example_avro_data/' 
TBLPROPERTIES ('avro.schema.url'='hdfs:///user/nifi/example_avro.avsc');

And just like that you can query your Avro data through Hive.

But what about the small file problem?

The astute observer will no doubt notice that as built, our pipeline writes one file in HDFS per FlowFile, which means one per generated message. That’s fine for our example, but HDFS doesn’t like tons of tiny files, which is exactly what we would have if we were consuming live sensor data or the Twitter firehose or whatever. The solution is to introduce a MergeContent step before our ConvertRecord step. MergeContent simply takes multiple FlowFiles and merges them into a single FlowFile. Doing it in this sequence means each output file carries one embedded copy of the Avro schema, whereas if we merged after the ConvertRecord, the merged file would contain one embedded schema per original FlowFile. It turns out Hive doesn’t like that so much.
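A toy model makes the ordering argument concrete. Plain Python lists stand in for FlowFile contents here, and the header string is a stand-in for the real binary Avro container header, not actual Avro bytes:

```python
# Stand-in for the binary header ConvertRecord embeds in each Avro file.
SCHEMA_HEADER = "Objavro.schema{...}"

def convert(records):
    """Pretend ConvertRecord: prepend one schema header to the records."""
    return [SCHEMA_HEADER] + list(records)

def merge(flowfiles):
    """Pretend MergeContent: concatenate FlowFile contents."""
    return [part for ff in flowfiles for part in ff]

flowfiles = [["rec1"], ["rec2"], ["rec3"]]

merge_then_convert = convert(merge(flowfiles))
convert_then_merge = merge([convert(ff) for ff in flowfiles])

print(merge_then_convert.count(SCHEMA_HEADER))  # 1: one header, Hive is happy
print(convert_then_merge.count(SCHEMA_HEADER))  # 3: a header per original FlowFile
```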

This whole task is rather straightforward in hindsight, now that I understand it, but it was tough sledding early on. A worthwhile exercise is to take the converted Avro record in nifi and convert it back into JSON with another set of record reader/writer controller services. Let me know if you found this helpful; I know I wish something like this existed when I was getting started with nifi.

In a future post we will examine the Schema Registry capabilities within CDP, and how it can simplify much of what we built out above.

Useful links:

Generate Avro schema from JSON data

Validate Avro schema

Avro tools

https://repo1.maven.org/maven2/org/apache/avro/avro-tools/1.9.1/

These are good for testing Avro from the command line. Running the jar with no arguments prints the list of available subcommands (tojson, getschema, and so on), for example:

java -jar avro-tools-1.9.1.jar
java -jar avro-tools-1.9.1.jar tojson your_file.avro
