A (not so big) intro to Docker

Docker

Docker is a platform for developing, shipping, and running applications. According to Wikipedia, Docker is an open-source project that automates the deployment of applications inside software containers by providing an additional layer of abstraction and automation of operating-system-level virtualization on Linux. In practice, Docker lets you run almost any application inside a securely isolated container in a multi-tenant (or simply, multi-container) environment. Because there is no hypervisor, unlike the traditional virtual machine approach, these lightweight containers usually start faster and carry less overhead.

Docker uses resource isolation features of the Linux kernel, such as cgroups and kernel namespaces, to allow independent “containers” to run within a single Linux instance, avoiding the overhead of starting virtual machines. The Linux kernel’s namespaces isolate an application’s view of the operating environment, including process trees, network, user IDs and mounted file systems, while cgroups provide resource isolation, including CPU, memory, block I/O and network. Docker includes the libcontainer library as a reference implementation for containers, and can also work through libvirt, LXC (Linux Containers) and systemd-nspawn, which provide interfaces to the facilities provided by the Linux kernel.

With just enough of a definition in hand, let’s get our hands dirty with a first run of Docker.
You can go to the Try Docker tutorial and run the commands below:

Docker prompt and Version:

you@tutorial:~$ docker
Usage: Docker [OPTIONS] COMMAND [arg...]
-H="127.0.0.1:4243": Host:port to bind/connect to

A self-sufficient runtime for linux containers.

Commands:

attach    Attach to a running container
build     Build a container from a Dockerfile
commit    Create a new image from a container's changes
diff      Inspect changes on a container's filesystem
export    Stream the contents of a container as a tar archive
history   Show the history of an image
images    List images
import    Create a new filesystem image from the contents of a tarball
info      Display system-wide information
insert    Insert a file in an image
inspect   Return low-level information on a container
kill      Kill a running container
login     Register or Login to the Docker registry server
logs      Fetch the logs of a container
port      Lookup the public-facing port which is NAT-ed to PRIVATE_PORT
ps        List containers
pull      Pull an image or a repository from the Docker registry server
push      Push an image or a repository to the Docker registry server
restart   Restart a running container
rm        Remove a container
rmi       Remove an image
run       Run a command in a new container
search    Search for an image in the Docker index
start     Start a stopped container
stop      Stop a running container
tag       Tag an image into a repository
version   Show the Docker version information
wait      Block until a container stops, then print its exit code

Pulling from Repository:

Let’s search the repository for an image called tutorial and pull it into our environment.

you@tutorial:~$ docker search tutorial
Found 1 results matching your query ("tutorial")
NAME                      DESCRIPTION
learn/tutorial            An image for the interactive tutorial
you@tutorial:~$ docker pull learn/tutorial
Pulling repository learn/tutorial from https://index.docker.io/v1
Pulling image 8dbd9e392a964056420e5d58ca5cc376ef18e2de93b5cc90e868a1bbc8318c1c (precise) from ubuntu
Pulling image b750fe79269d2ec9a3c593ef05b4332b1d1a02a62b4accb2c21d589ff2f5f2dc (12.10) from ubuntu
Pulling image 27cf784147099545 () from tutorial

Running the Helloworld:

Let’s run our very first hello world with the echo command:

you@tutorial:~$ docker run learn/tutorial echo "My Wonderful HelloWorld in Docker"
My Wonderful HelloWorld in Docker

Install package and commit:

Let’s install the ping utility with apt-get. Once that is done, let’s save the result as a new image named learn/ping.
Once the commit is done, we can push the image to [Docker Hub](https://hub.docker.com).

you@tutorial:~$ docker run learn/tutorial apt-get install -y ping
Reading package lists...
Building dependency tree...
The following NEW packages will be installed:
  iputils-ping
0 upgraded, 1 newly installed, 0 to remove and 0 not upgraded.
Need to get 56.1 kB of archives.
After this operation, 143 kB of additional disk space will be used.
Get:1 http://archive.ubuntu.com/ubuntu/ precise/main iputils-ping amd64 3:20101006-1ubuntu1 [56.1 kB]
debconf: delaying package configuration, since apt-utils is not installed
Fetched 56.1 kB in 1s (50.3 kB/s)
Selecting previously unselected package iputils-ping.
(Reading database ... 7545 files and directories currently installed.)
Unpacking iputils-ping (from .../iputils-ping_3%3a20101006-1ubuntu1_amd64.deb) ...
Setting up iputils-ping (3:20101006-1ubuntu1) ...

you@tutorial:~$ docker ps -l
ID                  IMAGE               COMMAND                CREATED             STATUS              PORTS
6982a9948422        ubuntu:12.04        apt-get install ping   1 minute ago        Exit 0

you@tutorial:~$ docker commit 698 learn/ping
effb66b31edb

you@tutorial:~$ docker run learn/ping ping www.google.com
PING www.google.com (74.125.239.129) 56(84) bytes of data.
64 bytes from nuq05s02-in-f20.1e100.net (74.125.239.148): icmp_req=1 ttl=55 time=2.23 ms
64 bytes from nuq05s02-in-f20.1e100.net (74.125.239.148): icmp_req=2 ttl=55 time=2.30 ms
64 bytes from nuq05s02-in-f20.1e100.net (74.125.239.148): icmp_req=3 ttl=55 time=2.27 ms
64 bytes from nuq05s02-in-f20.1e100.net (74.125.239.148): icmp_req=4 ttl=55 time=2.30 ms

you@tutorial:~$ docker push learn/ping

Detail about a container:

We can inspect the details of a running container with the inspect command. We don’t need to type the whole ID; the first three or four characters are enough, as long as they identify the container uniquely.

you@tutorial:~$ docker ps
ID                  IMAGE               COMMAND               CREATED             STATUS              PORTS
efefdc74a1d5        learn/ping:latest   ping www.google.com   37 seconds ago      Up 36 seconds
you@tutorial:~$ docker inspect efe
[2013/07/30 01:52:26 GET /v1.3/containers/efef/json
{
  "ID": "efefdc74a1d5900d7d7a74740e5261c09f5f42b6dae58ded6a1fde1cde7f4ac5",
  "Created": "2013-07-30T00:54:12.417119736Z",
  "Path": "ping",
  "Args": [
      "www.google.com"
  ],
  "Config": {
      "Hostname": "efefdc74a1d5",
      "User": "",
      "Memory": 0,
      "MemorySwap": 0,
      "CpuShares": 0,
      "AttachStdin": false,
      "AttachStdout": true,
      "AttachStderr": true,
      "PortSpecs": null,
      "Tty": false,
      "OpenStdin": false,
      "StdinOnce": false,
      "Env": null,
      "Cmd": [
          "ping",
          "www.google.com"
      ],
      "Dns": null,
      "Image": "learn/ping",
      "Volumes": null,
      "VolumesFrom": "",
      "Entrypoint": null
  },
  "State": {
      "Running": true,
      "Pid": 22249,
      "ExitCode": 0,
      "StartedAt": "2013-07-30T00:54:12.424817715Z",
      "Ghost": false
  },
  "Image": "a1dbb48ce764c6651f5af98b46ed052a5f751233d731b645a6c57f91a4cb7158",
  "NetworkSettings": {
      "IPAddress": "172.16.42.6",
      "IPPrefixLen": 24,
      "Gateway": "172.16.42.1",
      "Bridge": "docker0",
      "PortMapping": {
          "Tcp": {},
          "Udp": {}
      }
  },
  "SysInitPath": "/usr/bin/docker",
  "ResolvConfPath": "/etc/resolv.conf",
  "Volumes": {},
  "VolumesRW": {}
}

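Because docker inspect prints JSON, its output is easy to consume from a script. Here is a minimal sketch in Python, assuming the output has already been captured into a string. The sample is a trimmed copy of the output above, the helper name summarize_container is made up for illustration, and the field names follow the old output shown here (newer Docker releases may differ).

```python
import json

# Trimmed copy of the inspect output shown above (assumed captured as text).
SAMPLE = """
{
  "ID": "efefdc74a1d5900d7d7a74740e5261c09f5f42b6dae58ded6a1fde1cde7f4ac5",
  "Path": "ping",
  "Args": ["www.google.com"],
  "State": {"Running": true, "Pid": 22249, "ExitCode": 0},
  "NetworkSettings": {"IPAddress": "172.16.42.6", "Bridge": "docker0"}
}
"""


def summarize_container(inspect_json):
    """Hypothetical helper: pick a few fields out of inspect output."""
    info = json.loads(inspect_json)
    return {
        "short_id": info["ID"][:12],
        "command": " ".join([info["Path"]] + info["Args"]),
        "running": info["State"]["Running"],
        "ip": info["NetworkSettings"]["IPAddress"],
    }


print(summarize_container(SAMPLE))
```

The same approach works for any other field in the output, for example State.Pid or NetworkSettings.Gateway.
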
References:
1. [Docker Documentation](https://docs.docker.com/)
2. [Understanding Docker](https://docs.docker.com/introduction/understanding-docker/)

Python *args and **kwargs demystified

*args and **kwargs demystified

Having a hard time understanding the *args and **kwargs magic variables in Python? This article tries to make them a bit easier. In fact, the concept is very simple, even though the syntax looks intimidating because of its resemblance to C/C++ pointers. *args and **kwargs are mostly used in function definitions, where they allow you to pass a variable number of arguments to a function. “Variable number” means you don’t know the quantity in advance. Let’s see an example of *args:

def foo(arg, *args):
    # arg takes the first positional argument; args collects the rest as a tuple
    print("first variable:{}".format(arg))
    for count, item in enumerate(args, 2):
        print("variable no:{} => {}".format(count, item))


foo('python', 'is', 'better', 'than', 'java')

joy [chef_painting] $ python args.py 
first variable:python
variable no:2 => is
variable no:3 => better
variable no:4 => than
variable no:5 => java

With **kwargs, we can pass key-value pairs (keyword arguments) to a Python function; they arrive as a dictionary.

def foo(**kwargs):
    # kwargs is always a dict (empty when no keyword arguments are given)
    count = 1
    for k, v in kwargs.items():
        print("#{} key:{} => value:{}".format(count, k, v))
        count += 1


foo(lang='python', version='2.7', author='joy')

joy [chef_painting] $ python kwargs.py 
#1 key:lang => value:python
#2 key:version => value:2.7
#3 key:author => value:joy
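The two forms can also be combined in one signature, and the same stars work in the other direction at the call site. Below is a small sketch of both; the function name describe and the sample values are made up for illustration.

```python
def describe(first, *args, **kwargs):
    """Return a summary of how the arguments were bound."""
    return {"first": first, "args": args, "kwargs": kwargs}


# Extra positional values land in args, named ones in kwargs.
print(describe("python", "is", "fun", version="2.7"))

# The same stars unpack a sequence or a dict at the call site.
words = ["python", "is", "fun"]
options = {"version": "2.7", "author": "joy"}
print(describe(*words, **options))
```

Note that args is always a tuple and kwargs is always a dict, even when nothing extra is passed.
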

Linux Perf Tools

Linux Performance Observability

I couldn’t resist sharing this amazing diagram, with due credit to The Linux Foundation. As the figure shows, it maps the tools and commands you can use to gather performance statistics from each component of the whole stack.

[Figure: Linux performance observability tools diagram]

pdb – The Python Debugger

Debugging with pdb

First of all, read Steve Ferg’s blog post, which explains pdb in detail.

Here is a summary of the essential commands:

import pdb
Set a breakpoint by calling pdb.set_trace()

In-debugger keys:

- n (next)
- ENTER (repeat previous)
- q (quit)
- p <variable> (print value)
- c (continue) 
- l (list where you are)
- s (step into subroutine)
- r (continue till the end of the subroutine)
- ! <python command>
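To try these keys out, here is a minimal sketch of a script with a breakpoint. The function average and its debug flag are made up for illustration; the flag only exists so that the script also runs straight through when no debugging is wanted.

```python
import pdb


def average(values, debug=False):
    """Return the mean of values; pause in pdb first when debug is True."""
    total = sum(values)
    if debug:
        # Execution stops here and the (Pdb) prompt appears.
        # Try: p total, n, s, c, q
        pdb.set_trace()
    return total / float(len(values))


# Call average([2, 4, 6], debug=True) from a terminal to get the prompt.
print(average([2, 4, 6]))
```

At the prompt, p total prints the running sum, n moves to the return line, and c lets the program finish.
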

Python Lambda

Python lambda expressions show up in many places. Although they look odd at first sight, they are not really that hard. A Python lambda is mainly used to create a quick, simple one-line function (one that is unlikely to be called repeatedly). Let’s try a small, easy example.

>>> one_liner_add = lambda a, b: a + b

>>> print(one_liner_add(4, 5))
9

>>> print(one_liner_add(50, 10))
60

Note that the expression needs no explicit return statement and no separate declaration of its inputs beyond the parameter list. It is enough to call the lambda expression directly, or to call the variable the function object was assigned to, with the appropriate arguments. Let’s look at another example:

>>> my_list = ['Joy', 'Bangldesh', 'dhaka', 'bandarban', 'abahoni', 'Notredame College', 'DU' ]
>>> my_list.sort()
>>> print(my_list)
['Bangldesh', 'DU', 'Joy', 'Notredame College', 'abahoni', 'bandarban', 'dhaka']

If we use the plain sort function, case sensitivity gives a confusing result: all the capitalized names come first. We can fix this by defining the key parameter. Instead of writing a full-fledged function, the job is easily done with a lambda expression.

>>> my_list.sort(key=lambda x: x.lower())
>>> print(my_list)
['abahoni', 'bandarban', 'Bangldesh', 'dhaka', 'DU', 'Joy', 'Notredame College']
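To make the comparison with an ordinary function concrete, here is a short sketch that puts a lambda next to the equivalent def and reuses the key idea with sorted(). The names add_lambda, add_def and the shortened list are made up for illustration.

```python
# A lambda and a def that do the same thing.
add_lambda = lambda a, b: a + b


def add_def(a, b):
    return a + b


names = ['Joy', 'dhaka', 'DU', 'abahoni']

# sorted() returns a new list and accepts the same key argument as list.sort().
by_lower = sorted(names, key=lambda x: x.lower())
by_length = sorted(names, key=lambda x: len(x))

print(add_lambda(4, 5) == add_def(4, 5))
print(by_lower)
print(by_length)
```

The key function is called once per item and only decides the ordering; the items themselves are returned unchanged.
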


Bulk Renaming Python Script

During a MapReduce job, I needed to bulk-rename input files to a specific pattern, so I wrote the simple Python script below to do the job.

import os
import sys

# Expect exactly four arguments after the script name.
if len(sys.argv) != 5:
    print("SYNTAX : python file_rename.py directory_name matching_extension new_name_pattern new_extension")
    sys.exit(1)

directory_name, old_extension, name_pattern, new_extension = sys.argv[1:5]
print("Running jobs on directory: {} having extensions {}".format(directory_name, old_extension))

counter = 1
for old_file in os.listdir(directory_name):
    # Match on the end of the file name, not anywhere inside it.
    if old_file.endswith(old_extension):
        new_file = name_pattern + str(counter) + "." + new_extension
        os.rename(os.path.join(directory_name, old_file),
                  os.path.join(directory_name, new_file))
        print("file renamed from {} to {}".format(old_file, new_file))
        counter += 1
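One thing to keep in mind is that os.listdir returns names in arbitrary order, so the numbering can differ between runs. A possible variation, sketched below, first builds the list of renames from sorted names so that it can be reviewed before anything is touched. The function names plan_renames and apply_renames and the sample file names are made up for illustration.

```python
import os


def plan_renames(file_names, old_extension, name_pattern, new_extension):
    """Return (old, new) name pairs without touching the filesystem."""
    suffix = "." + old_extension.lstrip(".")
    plan = []
    counter = 1
    # Sort so the numbering is the same on every run.
    for old_name in sorted(file_names):
        if old_name.endswith(suffix):
            new_name = "{}{}.{}".format(name_pattern, counter, new_extension)
            plan.append((old_name, new_name))
            counter += 1
    return plan


def apply_renames(directory_name, plan):
    """Carry out a plan produced by plan_renames."""
    for old_name, new_name in plan:
        os.rename(os.path.join(directory_name, old_name),
                  os.path.join(directory_name, new_name))


print(plan_renames(["b.txt", "a.txt", "notes.md"], "txt", "input_", "dat"))
```

Typical usage would be to print the plan for os.listdir(directory_name), check it, and only then pass it to apply_renames.
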

paste deploy script

PasteDeploy is a library used by WSGI middleware that makes it possible to configure WSGI components together declaratively within an .ini file. It was developed by Ian Bicking.

Entry Points and Paste Deploy .ini files:

###
# app configuration
###
[app:main]
use = egg:MyProject

OR

[app:main]
paste.app_factory = configured:app_factory
name = Joy
greeting = ZeroVM

###
# wsgi server configuration
###
[server:main]
use = egg:waitress#main
host = 0.0.0.0
port = 6543

At a minimum, the configuration file defines an application instance and the server that runs it. In the first form of the configuration, the line use = egg:MyProject under [app:main] is actually shorthand for a longer spelling: use = egg:MyProject#main. The #main part is omitted for brevity, as #main is a default defined by PasteDeploy. egg:MyProject#main is a string that has meaning to PasteDeploy: it points at a setuptools entry point named main defined in the MyProject project.
In the second form of the configuration:
- Line 1 begins a section with the app: prefix, used to define a WSGI application endpoint. The name main makes this application the default for this file.
- Line 2 tells PasteDeploy to look up the function app_factory in the module configured, and call it to get the application.
- Lines 3-4 provide configuration values that are passed to the factory.
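For the second form, the factory could look roughly like the sketch below. It assumes a module named configured, as in the .ini file, and follows the PasteDeploy convention that an app factory receives global_config plus the section's key/value pairs as keyword arguments. The response text and the default values are made up for illustration.

```python
# configured.py (module name taken from the .ini file above)


def app_factory(global_config, **local_conf):
    """Build a WSGI app from the key/value pairs under [app:main]."""
    name = local_conf.get("name", "World")
    greeting = local_conf.get("greeting", "Hello")
    body = "{}, {}!\n".format(greeting, name).encode("utf-8")

    def application(environ, start_response):
        # Plain WSGI callable: send status and headers, return the body.
        start_response("200 OK", [("Content-Type", "text/plain"),
                                  ("Content-Length", str(len(body)))])
        return [body]

    return application


# PasteDeploy would make this call for us; here we do it by hand.
app = app_factory({}, name="Joy", greeting="ZeroVM")
```

Values from the .ini file always arrive as strings, so a factory that needs numbers or booleans has to convert them itself.
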

Take a look at the generated setup.py file for this project.

import os

from setuptools import setup, find_packages

here = os.path.abspath(os.path.dirname(__file__))
with open(os.path.join(here, 'README.txt')) as f:
    README = f.read()
with open(os.path.join(here, 'CHANGES.txt')) as f:
    CHANGES = f.read()

requires = [
    'pyramid',
    'pyramid_chameleon',
    'pyramid_debugtoolbar',
    'waitress',
    ]

setup(name='MyProject',
      version='0.0',
      description='MyProject',
      long_description=README + '\n\n' + CHANGES,
      classifiers=[
        "Programming Language :: Python",
        "Framework :: Pyramid",
        "Topic :: Internet :: WWW/HTTP",
        "Topic :: Internet :: WWW/HTTP :: WSGI :: Application",
        ],
      author='',
      author_email='',
      url='',
      keywords='web pyramid pylons',
      packages=find_packages(),
      include_package_data=True,
      zip_safe=False,
      install_requires=requires,
      tests_require=requires,
      test_suite="myproject",
      entry_points='''\
      [paste.app_factory]
      main = myproject:main
      ''',
      )

In the section named [paste.app_factory], there is a key named main (the entry point name) whose value is myproject:main. The key main is what the egg:MyProject#main value of the use setting in our config file points at, although it is shortened to egg:MyProject there. The value is a dotted Python name path that refers to a callable in our myproject package’s __init__.py module.

The egg: prefix in egg:MyProject indicates that this is an entry point URI specifier, where the “scheme” is “egg”. An “egg” is created when you run setup.py install or setup.py develop within your project.

In plain English, this entry point can thus be referred to as a “PasteDeploy application factory in the MyProject project which has the entry point named main where the entry point refers to a main function in the myproject module”. Indeed, if you open up the __init__.py module generated within any scaffold-generated package, you’ll see a main function. This is the function called by PasteDeploy when the pserve command is invoked against our application. It accepts a global configuration object and returns an instance of our application.

[DEFAULT] Section of a PasteDeploy .ini File

You can add a [DEFAULT] section to your PasteDeploy .ini file. Such a section should consist of global parameters that are shared by all the applications, servers and middleware defined within the configuration file. The values in a [DEFAULT] section will be passed to your application’s main function as global_config (see the reference to the main function in __init__.py).
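As a rough sketch of what that means for the factory, the function below receives the shared values as global_config and the section's own values as keyword arguments. Merging them so that the section wins is a choice made for this example, not something PasteDeploy does for you. To keep it printable, this main returns the merged dict rather than a WSGI application, and the keys debug and admin_email are made up.

```python
def main(global_config, **settings):
    """Merge shared [DEFAULT] values with the app's own settings."""
    # Design choice in this sketch: section values override shared ones.
    merged = dict(global_config)
    merged.update(settings)
    return merged


# Hand-built stand-ins for what PasteDeploy would pass in.
global_config = {"debug": "false", "admin_email": "ops@example.com"}
settings = {"debug": "true", "name": "Joy"}
print(main(global_config, **settings))
```

A real main would use the merged values to configure and return the application object.
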