I'm currently writing a program to automate the setup and configuration of a complex Linux/SAP/benchmark environment spread across three systems (DB/BW/benchmark driver).
I can do just about everything with standard Linux commands, but at one point, I needed to run a BW process chain.
Not easy from within a program - especially since it's a naked BW installation, so I couldn't set the chain to run on event and then create a file in some directory (or some other ugly hack)
So what to do?
The answer as usual comes from SAP themselves. They provide a NetWeaver SDK, with a sample program called startrfc. It allows you to run one of two function modules (EDI stuff, of no interest to anyone), passing parameters.
A handful of code changes, a quick SAP Notes search to find the appropriate gcc commands to rebuild, another search to find a specific Perl file used in the compilation, and bam...a new Linux command which allows me to run the RSPC_API_CHAIN_START function module, passing the name of the chain I want to run.
Game over, I win
Up until now, when creating a new job, the wizard would check with the different engines (backup and storage) each parameter to see if the value was OK with the engine.
This is all well and good, but not sufficient.
Imagine an FTP storage engine. It's a good idea to check that the host name entered does actually exist, but it makes more sense to check if we can connect with the username and password entered to the hostname on the port specified. This needs access to all four parameters at once, which wasn't available up until yesterday.
Now, while still checking individual parameters as before, there is a final check of all parameters together at the end.
Of course, an engine is free to implement either of these checks - the individual parameters or the whole parameter set. Or both. Or neither.
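To make the two levels of checking concrete, here is a minimal sketch in Python (not the language the application itself is written in). Every name in it (StorageEngine, FtpEngine, check_parameter, check_all, validate) is hypothetical and made up for illustration. It also assumes the final whole-set check only runs once the individual checks have passed, and the FTP engine's final check merely confirms that all four values are present, where a real engine would attempt a login.

```python
from typing import Dict, List, Optional


class StorageEngine:
    """Hypothetical base engine: both checks are optional and pass by default."""

    def check_parameter(self, name: str, value: str) -> Optional[str]:
        return None  # None means "no objection to this value"

    def check_all(self, params: Dict[str, str]) -> Optional[str]:
        return None  # None means "no objection to the set as a whole"


class FtpEngine(StorageEngine):
    """Hypothetical FTP engine implementing both checks."""

    REQUIRED = ("host", "port", "username", "password")

    def check_parameter(self, name: str, value: str) -> Optional[str]:
        if name == "port":
            try:
                port = int(value)
            except ValueError:
                port = 0
            if not 0 < port < 65536:
                return "port must be a number between 1 and 65535"
        elif name in ("host", "username") and not value.strip():
            return f"{name} must not be empty"
        return None

    def check_all(self, params: Dict[str, str]) -> Optional[str]:
        # Stand-in for a real connection attempt: only confirms that all
        # four values are available together.
        missing = [key for key in self.REQUIRED if key not in params]
        if missing:
            return "missing: " + ", ".join(missing)
        return None


def validate(engine: StorageEngine, params: Dict[str, str]) -> List[str]:
    """Run the per-parameter checks, then the final check on the whole set."""
    errors = []
    for name, value in params.items():
        message = engine.check_parameter(name, value)
        if message:
            errors.append(message)
    if not errors:  # assumption: skip the final check if a single value failed
        message = engine.check_all(params)
        if message:
            errors.append(message)
    return errors
```

An engine that overrides nothing (like the plain StorageEngine here) passes both checks, which matches the "neither" case.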
This weekend I also implemented the option of an engine requiring no parameters. This is true for the 'Copy Backup' backup engine, which used to ask for a 'job name' field - which has no place in there at all. So I took it out, and the engine now requires no parameters. It has broken something in the backup worker though. Will have to look into that.
Spent a very busy weekend, but am happy with the results
Have added some new features to the system, such as the 'first run' wizard shown below:
I've also added a fair bit of user interface improvements. For example, the list of engines available, which used to look like this when creating a new job:
...now looks like this in the new wizard (and soon to be improved in the main job creation wizard too) :
Note how engines can now have an icon (simply a resource in the dll with type of Bitmap and name of ICON). The version and copyright are read dynamically from the dll too.
This information and UI improvements can of course be reused elsewhere:
There's a way to go before I can call this application 'pretty', but it's an improvement!
Incidentally, the code behind all of this is a lot more modular too, with considerably fewer dependencies, so I can reuse it just about anywhere. Of course, it's all generic too, depending very little on what the dll is 'capable' of.
Oh, one small detail - backups can now handle empty directories (previously the backup was of files, the directories were only attributes of the file). Still need to handle the directories in the restore process, but the information is saved.
After several months offline (changing internet service providers is not usually a good move...) we're back online
The backup software has made serious progress - backup and restore can now be done flawlessly, either archive or copy, to a local drive
FTP, S3 etc need to be tested
OneDrive and other cloud providers (box, etc) can be added easily
Improved the user interface for the storage providers with better descriptions, icons etc. Basically the DLL can now provide more comprehensive info to the user
If the dlls for backup and storage are all in the same directory (as they are when I'm testing), then only the appropriate type (backup engine or storage engine) will be shown, depending on the action being done in the main gui (job create or restore, etc). This was a big pain for me, and I'm happy to have it done
I think I'll polish what currently exists and release it as-is, which will allow me to work on more stuff later. Otherwise it'll never get released
The imdb project is on hold. I have had no real time to work on this
My Raspberry Pi 2 has finally been installed. I purchased a Tontec 3.5" LCD screen for it, and it now boots directly into X, logs in automatically, and runs a flip clock application in fullscreen on boot. So my Raspberry Pi is a somewhat expensive flip clock...
But I have plans for it for later (backup-related, of course)
My next project for it will be making the OS reside on an iSCSI target (my QNAP), which involves rebuilding the kernel (yay!). I'll look at that over the weekend maybe
I'm still playing round with my pseudo-database. Basically it can now load in data from the disk-persistence, run some query ("select __rownum from table where col1 = 'GBP'") and then use the results from that first subquery to give some useful response ("select custname from table where __rownum in (x, y, z)").
Columns currently have two types - integer or string, and string columns can have a specified length. This allows me to emulate most of the datatypes supported by sql. Each type of column can currently 'decide' how to store its data, and that's (still) where things get complicated.
In order to do lookups, I basically need two functions per column type - getRownumsByValue and getValuesByRownum. In other words, I need to be able to do lookups by both key and value. Lookup in an array, by id (rownum) is easy to code - it's just array[id] - and the results are found relatively fast.
However, doing lookups by value this way means walking the whole array, which is far too slow. I can store the information differently (a dictionary where each key is a string value, and the associated "value" is a list of rownums). It means I convert the list of key=value into something like value=keys. Doing a lookup by value is now very fast (a dictionary is a hash table), but doing a lookup by rownum is now slow again...
Dictionaries do have the advantage of compressing the data (by interning the strings, as it were). It is, however, rather impressive how much memory a list of row numbers can take up. Once you have more than 65536 rows in your column, a rownumber takes up 4 bytes, which is quite a lot. Even with compression of the data, a string column ends up using around twice as much memory as the original file of strings.
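The row number arithmetic can be sketched in a few lines of Python. This assumes row numbers are stored in the smallest unsigned integer width that fits (1, 2, 4 or 8 bytes) and ignores any per-list overhead. The function names are made up for the example.

```python
def bytes_per_rownum(row_count: int) -> int:
    """Smallest unsigned width (in bytes) able to hold rownums 0..row_count-1."""
    for width in (1, 2, 4, 8):
        if row_count <= 2 ** (8 * width):
            return width
    raise ValueError("row count too large for an 8-byte row number")


def rownum_list_bytes(row_count: int) -> int:
    """Raw size of one row number per row, ignoring container overhead."""
    return row_count * bytes_per_rownum(row_count)


# 65536 rows still fit in 2 bytes each; one more row pushes every entry to 4.
print(rownum_list_bytes(65536))      # 131072
print(rownum_list_bytes(1_000_000))  # 4000000
```

So a value=keys dictionary over a million-row column carries about 4 MB of row numbers on top of the strings themselves, whatever the strings contain.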
Of course, the fastest lookups would mean storing the data in both key=value and value=key form.
Not the best way of doing things, but very fast: lookup by value in 1000000 lines takes 0ms to find 333333 lines, and lookup by rownum of those 333333 lines to find another 333333 values takes 16ms. In other words, select custname from table where currency = 'GBP' takes just under 20ms for a table of 1M lines...I'm quite happy with that
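For illustration, storing a column in both forms might look something like the Python sketch below. It is not the actual implementation: the class name StringColumn and the snake_case method names (mirroring getRownumsByValue and getValuesByRownum) are assumptions, and it makes no attempt at the compact storage discussed above.

```python
from typing import Dict, Iterable, List


class StringColumn:
    """A string column kept in both rownum->value and value->rownums form."""

    def __init__(self) -> None:
        self._by_rownum: List[str] = []            # key=value: index is the rownum
        self._by_value: Dict[str, List[int]] = {}  # value=keys: value -> rownums

    def append(self, value: str) -> int:
        """Add a value and return the rownum it was stored under."""
        rownum = len(self._by_rownum)
        self._by_rownum.append(value)
        self._by_value.setdefault(value, []).append(rownum)
        return rownum

    def get_rownums_by_value(self, value: str) -> List[int]:
        # One hash lookup instead of walking the whole array.
        return self._by_value.get(value, [])

    def get_values_by_rownum(self, rownums: Iterable[int]) -> List[str]:
        # Plain array indexing, one access per requested row.
        return [self._by_rownum[r] for r in rownums]


# select custname from table where currency = 'GBP', as two lookups
custname, currency = StringColumn(), StringColumn()
for name, cur in [("Alice", "GBP"), ("Bob", "USD"), ("Carol", "GBP")]:
    custname.append(name)
    currency.append(cur)

rownums = currency.get_rownums_by_value("GBP")  # [0, 2]
print(custname.get_values_by_rownum(rownums))   # ['Alice', 'Carol']
```

The price is that every value is recorded twice, once per direction, which is where the extra memory goes.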