Shantanu's Blog
Database Consultant
April 29, 2016
Finding the most relevant values for a given list
Let's assume we have a list of "good" numbers and we need to find how relevant another number, for example 103, is to this list. In this case the weight of 103 in the context of the given list is about 3.95. We can calculate the weight of every candidate number, sort the results, and take the top 4 or 5.
Here is how to generate test data for this exercise...
import pandas as pd
mylist=[100, 101, 102, 104, 105, 106, 107, 220, 221, 289, 290, 542, 544 ]
columnA=pd.DataFrame(mylist)
secondlist=[103, 299, 999, 108, 543]
row1=pd.DataFrame(secondlist)
# one new column per candidate: inverse distance to every "good" number
for i in row1[0]:
    columnA[i] = 1/ abs(columnA[0] - i)
# dataframe looks like this excel sheet
columnA
mypd=columnA.sum()
mypd.sort_values(inplace=True,ascending=False)
mypd
0 2831.000000
103 3.948958
108 2.551252
543 2.030021
299 0.280611
999 0.017591
dtype: float64
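The same weight can be computed without a dataframe. The sketch below assumes the weight of a candidate is the sum of 1/|x - candidate| over the good list, as in the loop above. The function name relevance and the handling of an exact match (treated as infinite weight) are assumptions made for illustration.

```python
def relevance(candidate, good_numbers):
    """Sum of inverse distances from candidate to every good number."""
    total = 0.0
    for x in good_numbers:
        distance = abs(x - candidate)
        if distance == 0:
            # Assumption: an exact match is treated as infinitely relevant
            return float("inf")
        total += 1.0 / distance
    return total

good = [100, 101, 102, 104, 105, 106, 107, 220, 221, 289, 290, 542, 544]
candidates = [103, 299, 999, 108, 543]

# Rank candidates from most to least relevant
ranked = sorted(candidates, key=lambda c: relevance(c, good), reverse=True)
print(ranked)
print(round(relevance(103, good), 6))
```

Unlike the dataframe version, this does not carry the original column along, so there is no extra row for the sum of the list itself.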
Labels: pandas, python, redshift, usability
April 28, 2016
pass all data to API gateway
It is possible to pass through the entire request body with a mapping template like this:
#set($inputRoot = $input.path('$'))
{
"body" : $input.json('$')
}
If you want to pass only a subset of the request body, change the '$' selector to the desired JsonPath.
http://docs.aws.amazon.com/apigateway/latest/developerguide/api-gateway-mapping-template-reference.html
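On the Lambda side, a template like this one places the whole request payload under the "body" key of the event. Here is a minimal Python sketch, assuming the client posts a JSON object. The response shape is only illustrative.

```python
def lambda_handler(event, context):
    # The mapping template wraps the request payload under "body"
    body = event.get("body", {})
    # Echo back the keys that were received (illustrative response only)
    return {"received_keys": sorted(body.keys())}

print(lambda_handler({"body": {"client_name": "pqr", "client_code": 8}}, None))
```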
Labels: aws, aws_lambda, usability
April 26, 2016
aws kinesis to elasticsearch
Here are a few lines of code that push a record to Elasticsearch via Kinesis Firehose. They can be used for streaming data.
import boto3
import json
firehose_client = boto3.client('firehose', region_name="us-east-1")
response = firehose_client.put_record(
    DeliveryStreamName='kfh',
    Record={
        'Data': json.dumps({'client_name': 'pqr', 'client_code': 8})
    }
)
_____
The equivalent AWS Lambda function will look something like this.
import boto3
import json
firehose_client = boto3.client('firehose', region_name="us-east-1")
def lambda_handler(event, context):
    myRecord={'client_name': str(event['client_name']), 'client_code': int(event['client_code']) }
    response = firehose_client.put_record(
        DeliveryStreamName='kn_to_es',
        Record={'Data': json.dumps(myRecord)}
    )
    return response
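For local testing it can help to pass the client in as a parameter, so that a stub can stand in for Firehose. This is a sketch under that assumption. The names send_record and FakeFirehose are hypothetical, and only the put_record call shown above is used.

```python
import json

def send_record(client, stream_name, payload):
    """Serialize payload as JSON and push it with put_record."""
    return client.put_record(
        DeliveryStreamName=stream_name,
        Record={'Data': json.dumps(payload)}
    )

class FakeFirehose:
    """Hypothetical stand-in that records calls instead of contacting AWS."""
    def __init__(self):
        self.calls = []

    def put_record(self, DeliveryStreamName, Record):
        self.calls.append((DeliveryStreamName, Record))
        return {'RecordId': 'fake'}

fake = FakeFirehose()
send_record(fake, 'kfh', {'client_name': 'pqr', 'client_code': 8})
print(fake.calls)
```

In real use the first argument would be the boto3 firehose client created above.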
Labels: aws, usability
April 24, 2016
Using docker differently
1) There is no need to run each container as a daemon. We can call the container as and when required, like this...
[root@ip-172-31-8-33 ec2-user]# docker run ubuntu echo "hello world"
hello world
We can also use Python to execute such commands.
How to install:
pip install -e git+https://github.com/deepgram/sidomo.git#egg=sidomo
How to use:
from sidomo import Container
with Container('ubuntu') as c:
    for output_line in c.run('echo hello world'):
        print(output_line)
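The same one-off call can be made with the standard library alone. This is a sketch, assuming the docker CLI is on the PATH. The helper names are hypothetical, and --rm is added so that the container is removed once it exits.

```python
import subprocess

def docker_run_command(image, command, remove=True):
    """Build the argument list for a one-off 'docker run'."""
    args = ["docker", "run"]
    if remove:
        args.append("--rm")
    return args + [image] + list(command)

def run_in_container(image, command):
    """Run the command in a fresh container and return its output."""
    result = subprocess.run(
        docker_run_command(image, command),
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )
    return result.stdout

# Only the command is printed here; run_in_container needs a docker host
print(docker_run_command("ubuntu", ["echo", "hello world"]))
```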
2) Docker containers can also be used for testing. For example, Selenium browser testing is possible using docker:
## http://testdetective.com/selenium-grid-with-docker/
docker run -d -P --name selenium-hub selenium/hub
docker run -d --link selenium-hub:hub selenium/node-firefox
docker run -d -P --link selenium-hub:hub selenium/node-chrome
3) There are other containers that can be used. The list can be found here...
https://blog.jessfraz.com/post/docker-containers-on-the-desktop/
Some of the containers from the above link are listed:
# run tor proxies
docker run -d --restart always -v /etc/localtime:/etc/localtime:ro -p 9050:9050 --name torproxy jess/tor-proxy
docker run -d --restart always -v /etc/localtime:/etc/localtime:ro --link torproxy:torproxy -p 8118:8118 --name privoxy jess/privoxy
# help the tor network by running a bridge relay!
docker run -d -v /etc/localtime:/etc/localtime --restart always -p 9001:9001 --name tor-relay jess/tor-relay -f /etc/tor/torrc.bridge
# IRC Client
docker run -it -v /etc/localtime:/etc/localtime -v $HOME/.irssi:/home/user/.irssi --read-only --name irssi jess/irssi
# email app
docker run -it -v /etc/localtime:/etc/localtime -e GMAIL -e GMAIL_NAME -e GMAIL_PASS -e GMAIL_FROM -v $HOME/.gnupg:/home/user/.gnupg --name mutt jess/mutt
# text based twitter client
docker run -it -v /etc/localtime:/etc/localtime -v $HOME/.rainbow_oauth:/root/.rainbow_oauth -v $HOME/.rainbow_config.json:/root/.rainbow_config.json --name rainbowstream jess/rainbowstream
# lynx browser
docker run -it --name lynx jess/lynx
# chrome
docker run -it --net host --cpuset-cpus 0 --memory 512mb -v /tmp/.X11-unix:/tmp/.X11-unix -e DISPLAY=unix$DISPLAY -v $HOME/Downloads:/root/Downloads -v $HOME/.config/google-chrome/:/data --device /dev/snd --name chrome jess/chrome
Labels: aws, docker, usability
April 22, 2016
Pinterest to RSS feed discovery
I have written an API to generate the RSS feed of any Pinterest user and of all the boards owned by that user. This has been done using Amazon API Gateway and AWS Lambda. Here is how it works...
The Pinterest feed discovery is available after supplying the pin URL, something like this...
https://f84jmref3f.execute-api.us-east-1.amazonaws.com/stag/pintorss?url=http://in.pinterest.com/shantanuo/buy/
["http://in.pinterest.com/shantanuo/buy.rss", ["http://in.pinterest.com/shantanuo/feed.rss", "http://in.pinterest.com/shantanuo/my-wishlist.rss", "http://in.pinterest.com/shantanuo/truth.rss", "http://in.pinterest.com/shantanuo/tips.rss"]]
The first feed URL is the exact feed of the "buy" board. The other boards owned by the user "shantanuo" are also returned, as a list.
_____
And here is the source code:
1) API Gateway:
a) Method Execution - Get Method Request - URL Query String Parameters - add a variable called url
b) Method Execution - Get Integration Request - Mapping Templates - Content-Type - application/json - Mapping template
{"url": "$input.params('url')"}
The above will link the variables received from the API gateway to the lambda function event dictionary.
2) Lambda function:
def lambda_handler(event, context):
    myurl=event['url']
    x=feedspot(myurl)
    return x.url_others()
The feedspot class can be copied from ...
https://gist.github.com/shantanuo/e6112e464276e4ccbc34c36620b811f8
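The feed addresses in the sample response follow a simple pattern: the board URL without its trailing slash plus .rss, and feed.rss under the user's profile. The sketch below covers that pattern only, as inferred from the output above. It is not the feedspot class from the gist, and the function names are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit

def board_feed_url(board_url):
    """Turn a board URL into its RSS address by appending .rss."""
    return board_url.rstrip("/") + ".rss"

def user_feed_url(board_url):
    """Build the user's feed.rss address from any of their board URLs."""
    parts = urlsplit(board_url)
    # The first path segment is taken to be the user name
    user = parts.path.strip("/").split("/")[0]
    return urlunsplit((parts.scheme, parts.netloc, "/%s/feed.rss" % user, "", ""))

url = "http://in.pinterest.com/shantanuo/buy/"
print(board_feed_url(url))
print(user_feed_url(url))
```

Discovering the user's other boards requires fetching the profile page, which this sketch does not attempt.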
Labels: aws, aws_lambda, python, usability
April 17, 2016
Docker tips
# CMD and ENTRYPOINT
CMD is the default command to execute when an image is run. When no ENTRYPOINT is set, a shell-form CMD is run through /bin/sh -c. We can set ENTRYPOINT in our Dockerfile to make the container behave like an executable that takes command line arguments (with the default arguments supplied by CMD in the Dockerfile).
The entrypoint can be overridden at run time like this...
docker run -ti --entrypoint=bash cassandra
# In this case we override the command but entrypoint remains ls
docker run training/ls -l
# in Dockerfile
ENTRYPOINT ["/bin/ls"]
CMD ["-a"]
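How the two settings combine can be modelled in a few lines. This is a simplified sketch of the exec-form rule only, assuming an exec-form ENTRYPOINT such as ["/bin/ls"]: arguments given to docker run replace CMD and are appended to ENTRYPOINT. The function name is hypothetical.

```python
def effective_command(entrypoint, cmd, run_args):
    """Return the argv a container starts with (exec form only)."""
    # Arguments passed to 'docker run' replace the default CMD
    arguments = run_args if run_args else cmd
    return list(entrypoint) + list(arguments)

# ENTRYPOINT ["/bin/ls"] and CMD ["-a"], no extra arguments
print(effective_command(["/bin/ls"], ["-a"], []))
# docker run training/ls -l
print(effective_command(["/bin/ls"], ["-a"], ["-l"]))
```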
# Volumes in docker are read-write by default, but there is an :ro flag
# Use --rm with docker whenever you use -it to have containers automatically clean themselves up.
# Docker User Interface:
docker run -d -p 9000:9000 -v /var/run/docker.sock:/docker.sock --name dockerui abh1nav/dockerui:latest -e="/docker.sock"
# docker clean up container
docker run -v /var/run/docker.sock:/var/run/docker.sock -v /var/lib/docker:/var/lib/docker --rm martin/docker-cleanup-volumes --dry-run
# install docker machine
curl -L https://github.com/docker/machine/releases/download/v0.6.0/docker-machine-`uname -s`-`uname -m` > /usr/local/bin/docker-machine && \
chmod +x /usr/local/bin/docker-machine
# copy files using shell tricks
$ docker exec -i kickass_yonath sh -c 'mkdir -p /home/e1/e2'
$ docker exec -i kickass_yonath sh -c 'cat > /home/e1/e2/test.txt' < ./test.txt
# Clean up lots of containers at once (ubuntu based in this case) with:
$ docker ps -a | grep ubuntu | awk '{print $1}' | xargs docker stop
$ docker ps -a | grep ubuntu | awk '{print $1}' | xargs docker rm
# Clean up lots of images at once (in this case where they have no name) with:
$ docker images | grep '<none>' | awk '{print $3}' | xargs docker rmi
# Just pipe docker ps output to less -S so that the table rows are not wrapped:
docker ps -a | less -S
Labels: aws, docker, usability
April 15, 2016
Testing Lambda functions using Docker
AWS Lambda functions can be tested locally using docker, as explained in this article...
https://aws.amazon.com/blogs/compute/cloudmicro-for-aws-speeding-up-serverless-development-at-the-coca-cola-company/
_____
git clone https://github.com/Cloudmicro/lambda-dynamodb-local.git
cd lambda-dynamodb-local
docker-compose up -d
docker-compose run --rm -e FUNCTION_NAME=hello lambda-python
_____
If docker compose is not installed, then follow these steps:
curl -L https://github.com/docker/compose/releases/download/1.7.0/docker-compose-`uname -s`-`uname -m` > /usr/local/bin/docker-compose
chmod +x /usr/local/bin/docker-compose
Labels: aws, aws_lambda, boto, docker, python, usability
April 14, 2016
patterns in python
Pattern is a web mining module for Python. It has tools for:
Data Mining: web services (Google, Twitter, Wikipedia), web crawler, HTML DOM parser
Natural Language Processing: part-of-speech taggers, n-gram search, sentiment analysis, WordNet
Machine Learning: vector space model, clustering, classification (KNN, SVM, Perceptron)
Network Analysis: graph centrality and visualization.
https://github.com/clips/pattern
(Not yet compatible with python version 3)
Labels: python, usability
April 13, 2016
generate csv dynamically
Here is how to create a two column file with numbers and some text separated by comma.
with open('mp.csv', 'w') as f:
    for i in range(9819800000, 9819838475):
        f.write("%s, \"not qualified\" \n" % i)
Short, elegant, usable, easy-to-read code in Python gets my job done quickly!
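The csv module from the standard library can take care of the quoting. Here is a sketch of the same idea that writes to an in-memory buffer with a short range, so that the result is easy to inspect. The function name write_numbers is hypothetical.

```python
import csv
import io

def write_numbers(out, start, stop, label="not qualified"):
    """Write one 'number,"label"' row per number in [start, stop)."""
    # QUOTE_NONNUMERIC quotes the text column and leaves numbers bare
    writer = csv.writer(out, quoting=csv.QUOTE_NONNUMERIC, lineterminator="\n")
    for number in range(start, stop):
        writer.writerow([number, label])

buffer = io.StringIO()
write_numbers(buffer, 9819800000, 9819800003)
print(buffer.getvalue())
```

Passing a file opened with open('mp.csv', 'w', newline='') instead of the buffer writes the rows to disk.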
Labels: python, usability
April 11, 2016
Docker and wordpress
docker run --name mysqlwp -e MYSQL_ROOT_PASSWORD=dockerRootMySQL \
-e MYSQL_DATABASE=wordpress \
-e MYSQL_USER=wordpress \
-e MYSQL_PASSWORD=wordpresspwd \
-d mysql
docker run --name wordpress --link mysqlwp:mysql -p 80:80 \
-e WORDPRESS_DB_NAME=wordpress \
-e WORDPRESS_DB_USER=wordpress \
-e WORDPRESS_DB_PASSWORD=wordpresspwd \
-d wordpress
_____
vi /tmp/uploads.ini
file_uploads = On
memory_limit = 256M
upload_max_filesize = 256M
post_max_size = 300M
max_execution_time = 600
docker run --name wordpress2 --link mysqlwp:mysql -p 80:80 -m 64m \
-e WORDPRESS_DB_PASSWORD=wordpresspwd \
-v "/opt/wordpress":/var/www/html \
-v "/tmp/uploads.ini:/usr/local/etc/php/conf.d/uploads.ini" \
-d wordpress
# With docker, a site reachable only from the local machine can be started using -p 127.0.0.1::80
# open refine utility
# docker run --privileged -v /openrefine_projects/:/mnt/refine -p 35181:3333 -d psychemedia/ou-tm351-openrefine
Or use this:
docker run -p 3334:3333 -v /mnt/refine -d psychemedia/docker-openrefine
Freeswitch:
docker run -d sous/freeswitch
mongodb:
docker run -d -p 27017:27017 -p 28017:28017 -e MONGODB_PASS="mypass" tutum/mongodb
Monitor containers:
docker run -d --name=consul --net=host gliderlabs/consul-server -bootstrap -advertise=52.200.204.48
$ docker run -d \
--name=registrator \
--net=host \
--volume=/var/run/docker.sock:/tmp/docker.sock \
gliderlabs/registrator:latest \
consul://localhost:8500
Labels: aws, docker, usability
A docker-compose stack for Prometheus monitoring
Here are the commands to start Prometheus on port 9090 and cAdvisor on port 8080:
git clone https://github.com/vegasbrianc/prometheus.git
cd prometheus/
/usr/local/bin/docker-compose up -d
Labels: aws, docker, usability
load balancer using docker
# create a new directory and use it for the project
mkdir vg
cd vg
# install docker compose
curl -L https://github.com/docker/compose/releases/download/1.6.2/docker-compose-`uname -s`-`uname -m` > /usr/local/bin/docker-compose
chmod +x /usr/local/bin/docker-compose
# download compose example
git clone https://github.com/vegasbrianc/docker-compose-demo.git .
# start
/usr/local/bin/docker-compose up -d
# status
/usr/local/bin/docker-compose ps
# test
curl 0.0.0.0
# scale
/usr/local/bin/docker-compose scale web=5
# logs
/usr/local/bin/docker-compose logs
Labels: aws, docker, usability
docker adminer container
This container ships adminer.php. The container exposes port 80, which you can map to port 80 of the host machine.
docker run -d -p 80:80 clue/adminer
Download the adminer file and save it to /tmp/ folder.
wget http://www.adminer.org/latest.php -O /tmp/index.php
Oracle:
docker run -d -p 8080:80 -v /tmp/:/app lukaszkinder/apache-php-oci8-pdo_oci
MongoDB:
docker run -d -p 8070:80 -v /tmp:/var/www/html ishiidaichi/apache-php-mongo-phalcon
Labels: aws, docker, linux tips, python, usability
April 10, 2016
MySQL using docker
1) Create mysql data directory
mkdir -p /opt/Docker/masterdb/data
2) Create my.cnf file
mkdir -p /opt/Docker/masterdb/cnf
vi /opt/Docker/masterdb/cnf/config-file.cnf
# Config Settings:
[mysqld]
server-id=1
innodb_buffer_pool_size = 2G
innodb_log_file_size=212M
binlog_format=ROW
log-bin
3) Run docker image
docker run --name masterdb -v /opt/Docker/masterdb/cnf:/etc/mysql/conf.d -v /opt/Docker/masterdb/data:/var/lib/mysql -e MYSQL_ROOT_PASSWORD=mysecretpass -d shantanuo/mysql
The official docker image for MySQL can be used instead by simply specifying mysql:5.5
You can also use Percona or an image by Oracle, using percona:5.5 or mysql/mysql-server:5.5
Labels: aws, docker, mysql, mysql tips, shell script, usability
galera mysql cluster in docker
$ sudo docker run --detach=true --name node1 -h node1 erkules/galera:basic --wsrep-cluster-name=local-test --wsrep-cluster-address=gcomm://
$ sudo docker run --detach=true --name node2 -h node2 --link node1:node1 erkules/galera:basic --wsrep-cluster-name=local-test --wsrep-cluster-address=gcomm://node1
$ sudo docker run --detach=true --name node3 -h node3 --link node1:node1 erkules/galera:basic --wsrep-cluster-name=local-test --wsrep-cluster-address=gcomm://node1
# sudo docker exec -ti node1 mysql
mysql> show status like "wsrep_cluster_size";
+--------------------+-------+
| Variable_name | Value |
+--------------------+-------+
| wsrep_cluster_size | 3 |
+--------------------+-------+
1 row in set (0.00 sec)
_____
deploy Galera Cluster over multiple servers
By design, Docker containers are reachable using port-forwarded TCP ports only, even if the containers have IP addresses. Set up port forwarding for all TCP ports that are required for Galera to operate.
docker run -d -p 3306:3306 -p 4567:4567 -p 4444:4444 -p 4568:4568 --name nodea erkules/galera:basic --wsrep-cluster-address=gcomm:// --wsrep-node-address=10.10.10.10
docker run -d -p 3306:3306 -p 4567:4567 -p 4444:4444 -p 4568:4568 --name nodeb erkules/galera:basic --wsrep-cluster-address=gcomm://10.10.10.10 --wsrep-node-address=10.10.10.11
docker run -d -p 3306:3306 -p 4567:4567 -p 4444:4444 -p 4568:4568 --name nodec erkules/galera:basic --wsrep-cluster-address=gcomm://10.10.10.10 --wsrep-node-address=10.10.10.12
docker exec -t nodea mysql -e 'show status like "wsrep_cluster_size"'
+--------------------+-------+
| Variable_name | Value |
+--------------------+-------+
| wsrep_cluster_size | 3 |
+--------------------+-------+
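The three commands differ only in the container name and the addresses, so they can be generated. The sketch below builds the command strings from a list of host IPs, assuming the first host bootstraps the cluster as above. The function name galera_commands is hypothetical.

```python
import string

PORTS = [3306, 4567, 4444, 4568]

def galera_commands(host_ips, image="erkules/galera:basic"):
    """Build one 'docker run' command per host; the first bootstraps."""
    publish = " ".join("-p %d:%d" % (p, p) for p in PORTS)
    commands = []
    for index, ip in enumerate(host_ips):
        # The first node starts a new cluster, the rest join it
        cluster = "gcomm://" if index == 0 else "gcomm://" + host_ips[0]
        name = "node" + string.ascii_lowercase[index]
        commands.append(
            "docker run -d %s --name %s %s "
            "--wsrep-cluster-address=%s --wsrep-node-address=%s"
            % (publish, name, image, cluster, ip)
        )
    return commands

for line in galera_commands(["10.10.10.10", "10.10.10.11", "10.10.10.12"]):
    print(line)
```

Each printed line is meant to be run on the host whose address appears in --wsrep-node-address.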
Labels: docker, mysql, usability
Enable password on ec2
It is possible to create a new user with sudo permissions on an EC2 instance. We can enable password authentication and remove the dependency on the pem file. Security can then be managed through the "security group".
1) Add a user called "dev" and set the password
sudo useradd -s /bin/bash -m -d /home/dev -g root dev
sudo passwd dev
2) Allow sudo access to the dev user
sudo visudo
dev ALL=(ALL:ALL) ALL
3) Enable password authentication in config file
vi /etc/ssh/sshd_config
PasswordAuthentication yes
#PasswordAuthentication no
4) Restart ssh
sudo /etc/init.d/sshd restart
Alternative solution:
# Turn on password authentication
sudo sed -i 's/^PasswordAuthentication.*/PasswordAuthentication yes/' /etc/ssh/sshd_config
# Reload SSHd configuration
sudo service sshd reload
# Set a password for the ec2-user
sudo passwd ec2-user
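The effect of the sed substitution can be checked on a sample string before touching the real file. Here is a sketch of the same replacement in Python, using made-up sample text. The function name is hypothetical. Like the sed pattern, it leaves commented-out lines alone.

```python
import re

def enable_password_auth(config_text):
    """Set every uncommented PasswordAuthentication line to 'yes'."""
    return re.sub(
        r"^PasswordAuthentication.*$",
        "PasswordAuthentication yes",
        config_text,
        flags=re.MULTILINE,
    )

sample = "Port 22\nPasswordAuthentication no\n#PasswordAuthentication no\n"
print(enable_password_auth(sample))
```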
Labels: aws, usability