Different Territory? RapidMiner Terminology

This is a nerdy comment about language. As I mentioned earlier, I’m working to catch up with the RapidMiner Assignments. Going over the tutorials, I found RapidMiner’s terms curiously unfamiliar, while the appearance of the data structures seemed very familiar. For example, the Preprocessing operator calls a “field” (a common term in database parlance[1]) an “attribute”.

On Wikipedia I found a page on attributes, Attribute (computing), with a section that offhandedly explains the terms:

Multi-valued databases On many post-relational or multi-valued databases systems, relative to SQL, tables are files, rows are items, and columns are attributes. Both in the database and code, attribute is synonymous with property and variable although attributes can be further defined to contain values and subvalues.

I thought it an interesting transformation. Then the ‘light bulb’ went off in my brain when I realized that ‘attributes’ are ‘fields’.
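To keep the vocabularies straight, here is a minimal sketch of the mapping from the Wikipedia passage above as a small Python lookup table. The names `SQL_TO_MULTIVALUE` and `translate` are my own, made up for illustration; they are not part of RapidMiner or any database product.

```python
# Term mapping taken from the quoted Wikipedia passage:
# relative to SQL, tables are files, rows are items, columns are attributes.
# The dictionary and function names are hypothetical, for illustration only.
SQL_TO_MULTIVALUE = {
    "table": "file",
    "row": "item",
    "column": "attribute",
}


def translate(term: str) -> str:
    """Return the multi-valued-database word for an SQL term.

    Unknown terms are returned unchanged.
    """
    return SQL_TO_MULTIVALUE.get(term.lower(), term)


for sql_term in ("table", "row", "column"):
    print(f"{sql_term} -> {translate(sql_term)}")
```

A term that is not in the table, such as "index", simply comes back unchanged.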

On Wikipedia I found another page connecting the term ‘attribute’ to the programming parlance of objects. This helps too, because I’m already familiar with ‘attributes’ in the programming lexicon.

Funnily enough, Wikipedia’s OOP page says this,

“Object-oriented programming (OOP) is a programming paradigm based on the concept of “objects”, which are data structures that contain data, in the form of fields, often known as attributes; and code, in the form of procedures, often known as methods.”
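The overlap is easy to see in code. Below is a small Python sketch, assuming a made-up `Customer` record (the class, its fields, and the `greeting` method are hypothetical). Python’s own `dataclasses` module calls the data members “fields”, yet they are read on the object as ordinary attributes.

```python
from dataclasses import dataclass, fields


@dataclass
class Customer:
    # Hypothetical record: these are the data, i.e. the "fields" or "attributes".
    name: str
    age: int

    # Code attached to the data, i.e. a "method".
    def greeting(self) -> str:
        return f"Hello, {self.name}"


row = Customer(name="Ada", age=36)

# dataclasses.fields() lists the declared fields of the record...
field_names = [f.name for f in fields(row)]
print(field_names)

# ...and each one is accessed as an attribute of the object.
print(row.name, row.age)
print(row.greeting())
```

Same data, two names: the library lists them as fields, and the object exposes them as attributes.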

There are other examples, too. Tableau and RapidMiner refer to repositories and data and ignore the basic terms database and file. When I started with Tableau, I recognized immediately what was going on in the language. But the lower-level terms were tripping me up.

Good thing a filter is not a sieve. Oh! Wait. That’s cooking…

For more on RapidMiner terms see their terminology video (12m).

Here’s another example: confidence == probability

“In the original data set, there were 271 cases where this prediction was correct, and 2 cases where it was wrong. So the confidence this prediction is (271)/(271+2) = 271/273 = 99.27%.” [RapidMiner Walkthrough: Step 15]
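The arithmetic in that quote can be checked in a few lines of Python. This is only a sketch of the calculation as the walkthrough describes it (correct cases divided by all cases); the variable names are mine, and it is not RapidMiner’s own code.

```python
# Counts taken from the quoted walkthrough step.
correct = 271
wrong = 2

# Confidence as described in the quote: the share of cases
# where the prediction was right.
confidence = correct / (correct + wrong)

print(f"{confidence:.2%}")
```

Read this way, ‘confidence’ is just the observed proportion of correct predictions, which is why it reads like a probability.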


[1] At least in SQL databases and desktop applications such as FileMaker