IBM i still plays a big role in many businesses, and a lot of valuable data remains trapped in legacy systems such as IMS and TPF. You can use Python and SQL to extract this information and load it into NoSQL databases like MongoDB or Redis, which helps you modernize your application architectures.

NoSQL technologies have become very popular over the last few years. According to DB-Engines, usage of NoSQL databases in production environments increased from about 32% in 2014 to about 52% in 2016. One likely driver is that many companies have started to migrate their data out of mainframe and legacy systems like IMS, TPF and VSAM and into modern databases like MongoDB or Redis. There are good reasons for doing so: modern databases can improve performance, scalability and availability, support new use cases, and let you take advantage of schema-less data stores such as MongoDB. Migrating also lets you retire old systems that are no longer maintained properly, such as IMS or VSAM.

However, there are also challenges: extracting data from one system and loading it into another is time-consuming and error-prone if you do not have the right tools at your disposal. With Python, SQLAlchemy (a library that provides SQL access to relational databases) and some plumbing code, you can extract data from almost any relational database system, even in batch jobs, and hand it to a client library for a non-relational target such as MongoDB.

In this article we will show how to transfer data from an IBM i into a MongoDB database using Python and SQLAlchemy. Along the way we will demonstrate how Python's asyncio module helps when working with large datasets. To keep things simple we won't consider full ETL processes, only extraction from a relational database system (IBM i) into a document store (MongoDB).
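To make the asyncio idea concrete, here is a minimal sketch of an extract/load pipeline built around a bounded queue. It is an illustration under stated assumptions, not the code from the repository: the function names, the batch size and the sample rows are all hypothetical, and an in-memory list stands in for both the IBM i source and the MongoDB target.

```python
import asyncio

BATCH_SIZE = 2  # hypothetical batch size, kept small for the demo

# Stand-in for rows fetched from the relational source (hypothetical data).
SOURCE_ROWS = [
    {"id": 1, "name": "Alice"},
    {"id": 2, "name": "Bob"},
    {"id": 3, "name": "Carol"},
]


async def extract(queue, rows, batch_size):
    """Put the rows on the queue in batches, then signal the end with None."""
    for start in range(0, len(rows), batch_size):
        await queue.put(rows[start:start + batch_size])
    await queue.put(None)


async def load(queue, sink):
    """Take batches off the queue until the end marker arrives."""
    while True:
        batch = await queue.get()
        if batch is None:
            break
        # A real loader would write to the target here. Note that pymongo's
        # insert_many() is a blocking call, so it would need an executor or
        # an async driver to fit into this pattern.
        sink.extend(batch)


async def transfer(rows, batch_size=BATCH_SIZE):
    """Run extraction and loading concurrently; return the loaded documents."""
    queue = asyncio.Queue(maxsize=4)  # bounded, so the producer cannot run far ahead
    sink = []
    await asyncio.gather(extract(queue, rows, batch_size), load(queue, sink))
    return sink


if __name__ == "__main__":
    loaded = asyncio.run(transfer(SOURCE_ROWS))
    print(len(loaded), "documents loaded")
```

The bounded queue is the design choice that matters for large datasets: the extractor pauses whenever the loader falls behind, so only a few batches are held in memory at any time.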
As always, our source code is available on GitHub for free under the Apache License 2.0: https://github.com/sbaiz/IBMi2Mongo

Setting Up The Environment

The first thing we want to do is create an environment where we can run our Python code. We will use the Anaconda distribution, which lets us install all dependencies in one go (including Python and MongoDB). There are other solutions out there, but this is the one I am used to, so it's the one I will use. Anaconda comes with a full-blown package manager called conda, which makes installing new packages and updating existing ones very easy. If Anaconda is not yet installed on your system, take a look at https://www.anaconda.com/download/#linux-64bit

Once you have installed it, open a terminal and run:

    conda list                        # List all installed packages
    conda info --envs                 # List all available environments
    conda create -n mongodb python=3  # Create an environment named 'mongodb' with Python 3
    source activate mongodb           # Activate the 'mongodb' environment (conda 4.4 and later also accept: conda activate mongodb)

Great! Now we have an environment with Python 3 installed. Let's see if everything is working as expected:

    python --version              # Version of the running interpreter
    python3 --version             # Should report the same version inside the environment
    pip3 --version                # Version of the installed package manager
    conda list                    # List all packages installed in this environment
    conda list -n mongodb python  # Show which version of Python is installed in this environment

If everything works, you should be able to run these commands in your terminal and get a version number or a package list as output.
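The same checks can also be made from inside Python, which is handy when you are not sure which interpreter a script is actually running under. The following is a small sketch using only the standard library; the function name describe_interpreter is my own and not part of any tool mentioned above.

```python
import platform
import sys


def describe_interpreter():
    """Collect the interpreter facts that the shell commands report."""
    return {
        "version": platform.python_version(),        # full version string
        "major": sys.version_info.major,             # should be 3
        "architecture": platform.architecture()[0],  # bitness of the interpreter
        "executable": sys.executable,                # path reveals the active environment
    }


if __name__ == "__main__":
    for key, value in describe_interpreter().items():
        print(f"{key}: {value}")
```

If the executable path does not point into the 'mongodb' environment, the environment is not active in that shell.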
If you get an error message, check that:

- You have activated the "mongodb" environment ( source activate mongodb ).
- You are using a 64-bit version of Python 3 (you can check this with python -c 'import platform; print(platform.architecture()[0])' ).
- You are using Python 3 (you can check this by running python --version , or by running pip --version , which also reports the Python version pip belongs to).
- The conda environment has been activated correctly (run conda info --envs ; the active environment is marked with an asterisk).

Creating the environment installs Python and its core dependencies. The project's own dependencies, such as pymongo and SQLAlchemy, are not installed automatically, so make sure that you install them yourself. If conda is not an option, check your operating system documentation on how to create software environments and configure one yourself (with virtualenv, for example); it's quite easy! Make sure that all versions match. The commands above cover only the basic checks; there is more to consider depending on whether your system uses virtualenv or conda environments. Still, this should be enough to get your environment ready and to make sure that everything is configured correctly!

Install MongoDB

You can install MongoDB in many different ways, but as this is a very simple example I will just use conda, the default package manager of Anaconda. To do so, run the following inside the environment created earlier:

    source activate mongodb              # Activate the 'mongodb' environment
    conda install -c conda-forge mongodb # Install MongoDB from the conda-forge channel
    pip install pymongo                  # Install the Python client for MongoDB

Installing the package does not start the server. Start it with mongod --dbpath /path/to/data (the path is a placeholder for a directory of your choice); once it is running, we have a functioning instance of MongoDB listening on the default port 27017.
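Before involving any MongoDB client, a quick way to see whether something is listening on port 27017 is a plain TCP connection attempt. This is a minimal sketch using only the standard library; port_is_open is a hypothetical helper name, and a successful connection only shows that the port is open, not that the process behind it is MongoDB.

```python
import socket


def port_is_open(host="localhost", port=27017, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unresolvable host names.
        return False


if __name__ == "__main__":
    state = "reachable" if port_is_open() else "not reachable"
    print("localhost:27017 is", state)
```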
To check that it's actually up and running, open a new terminal window, connect to your database using pymongo, and try to list all databases ( client.list_database_names() ) or execute some other basic operation (for example db.foo.find() ) using the Python client, and see whether the calls complete without errors. On a fresh installation they will return few or no results.

If the commands are not found, check that your PATH variable is set correctly in your bash profile: export PATH=$PATH:/path/to/anaconda/bin . Restart your terminal session after adding this line so that it takes effect. Alternatively, you can define a variable such as MONGODB_HOME in a shell script that you run pymongo from, or in /home/you/.bashrc , and reload it with source ~/.bashrc , which works without restarting your terminal session. Just remember that activating an environment ( source activate mongodb ) changes PATH only for the current shell session, and deactivating it restores the previous value, so permanent settings belong in your .bashrc and should be added there before you activate the environment.

To see if everything is working correctly, try accessing the database using the MongoDB shell by running mongo from the command line (recent MongoDB releases ship the shell as mongosh instead). The # annotations below are explanations, not part of the commands:

    mongo            # Start the MongoDB shell (connects to the 'test' database by default)
    use test         # Switch to the 'test' database
    show dbs         # List all databases that contain data
    db.dbs.find()    # Query a collection named 'dbs' (returns nothing unless you created it)
    db.test.find()   # Return the documents in a collection named 'test'
    exit             # Exit the shell

If you are familiar with SQL and MongoDB, you should be able to do some basic data manipulation such as creating new collections, inserting data into them and retrieving it again! Keep in mind that helpers like show dbs work only inside the MongoDB shell, not in Python, and that inside queries the $ prefix marks operators such as $gt or $set; leave it out and the name is treated as an ordinary field. Also remember that in MongoDB all queries are written as JSON-like documents passed to methods such as find() , which is why they look very strange compared to SQL, but this has its reasons!
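To show what "queries as documents" means in practice, here is a small sketch that builds a filter document and passes it to a collection's find() method. The field names (status, qty), the collection name and both helper functions are hypothetical; $gt is a standard MongoDB comparison operator, and find() is the pymongo collection method. The rough SQL equivalent would be SELECT * FROM orders WHERE status = 'open' AND qty > 10.

```python
def build_filter(status, min_qty):
    """Build a filter document: status equals the given value AND qty > min_qty."""
    return {"status": status, "qty": {"$gt": min_qty}}


def find_orders(collection, status, min_qty):
    """Run the filter against any collection object that offers find()."""
    return list(collection.find(build_filter(status, min_qty)))


# With pymongo installed and a server running, usage would look roughly like
# this (database and collection names are hypothetical):
#
#   from pymongo import MongoClient
#   client = MongoClient("localhost", 27017)
#   print(client.list_database_names())
#   orders = find_orders(client["test"]["orders"], "open", 10)

if __name__ == "__main__":
    print(build_filter("open", 10))
```

Keeping the filter in its own function means it can be inspected and tested without a running database.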
In general it is recommended not to write raw queries by hand on the relational side of the transfer, but to go for a high-level abstraction like SQLAlchemy, which we will discuss next!

SQLAlchemy And The Object Relational Mapper (ORM) Pattern For Python 3.6+ On An IBM i System

Now that we have shown how easy it is to install and set up a functional environment, let's move on and talk about what we actually want to achieve: using Python 3.6+ together with the SQLAlchemy ORM pattern to access legacy IMS-related data on an IBM i system (i5/OS V5R4M0).
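As a rough illustration of what the ORM pattern does, the sketch below maps table rows to Python objects and then to plain dictionaries, the shape a document store expects. It deliberately does not use SQLAlchemy: the mapping is hand-rolled with the standard library so that it runs anywhere, an in-memory SQLite database stands in for the IBM i source, and the customer table, its columns and all function names are hypothetical. SQLAlchemy's ORM automates this kind of mapping through declarative model classes.

```python
import sqlite3
from dataclasses import asdict, dataclass


@dataclass
class Customer:
    """One row of the hypothetical customer table, mapped to an object."""
    id: int
    name: str
    city: str


def load_customers(conn):
    """Map each row of the customer table to a Customer object."""
    cursor = conn.execute("SELECT id, name, city FROM customer ORDER BY id")
    return [Customer(*row) for row in cursor]


def to_documents(customers):
    """Turn mapped objects into plain dicts, ready for a document store."""
    return [asdict(customer) for customer in customers]


def demo_connection():
    """Create an in-memory SQLite database as a stand-in for the real source."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, city TEXT)"
    )
    conn.executemany(
        "INSERT INTO customer (id, name, city) VALUES (?, ?, ?)",
        [(1, "Alice", "Lisbon"), (2, "Bob", "Porto")],
    )
    return conn


if __name__ == "__main__":
    print(to_documents(load_customers(demo_connection())))
```

The point of the pattern is that the rest of the program works with Customer objects and never sees SQL strings or raw tuples.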


Marco Lopes
