Shantanu's Blog

Database Consultant

February 19, 2020

 

MySQL case study 184

How do I enable the MySQL general log and then query the log using commands like tail and grep?

mysql> set global general_log_file="general.log";
mysql> set global general_log = 1;

tail -f general.log | tee -a from_general.txt

# make sure to use "Select" (note the capital S) in your application query and then search for it in the general log

tail -f general.log | grep Select

grep -i "SELECT " /var/log/mysql/general.log | grep -io "SELECT .*" | sed 's|\(FROM [^ ]*\) .*|\1|' | sort | uniq -c | sort -nr | head -100

grep "from " general.log | awk -Ffrom '{print $2}' | awk '{print $1}' | cat

# Or use packetbeat to push the queries to Elasticsearch for a better search experience!
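The general log grows very quickly, so remember to switch it off once you have what you need. A quick sketch:

# check whether the general log is on and where it is written
mysql -e "show global variables like 'general_log%'"

# turn it off again when done
mysql -e "set global general_log = 0"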



October 05, 2016

 

shell script to backup all mysql tables

This is a generic shell script that will take a backup of all MySQL tables in CSV format.
I have added a --where clause to the mysqldump command so that it backs up only 10 rows of each table. This is for testing purposes only. When implementing the script in a production environment, remove that where clause.

#!/bin/sh
mydate=`date +"%d-%m-%Y-%H-%M"`

for db in `mysql -Bse"show databases"`
do
mydir='/tmp/testme/'$mydate/$db
rm -rf $mydir
mkdir -p $mydir
chmod 777 $mydir

for i in `mysql -Bse "select TABLE_NAME from information_schema.tables where TABLE_SCHEMA = '$db'"`
do
# take backup of only 10 rows for testing purpose
# remove --where clause in production
mysqldump $db $i --where='true limit 10' --tab=$mydir

done
done
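Restoring from such a dump is straightforward, because --tab writes a .sql file (table structure) and a .txt file (data) for each table. A minimal restore sketch; the date folder and table name below are just examples:

# recreate the table structure
mysql test < /tmp/testme/05-10-2016-10-00/test/mytable.sql

# reload the rows; mysqlimport picks the table name from the file name
mysqlimport test /tmp/testme/05-10-2016-10-00/test/mytable.txt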



May 21, 2016

 

decouple application using docker compose

# install packages
yum install -y git mysql-server docker

curl -L https://github.com/docker/compose/releases/download/1.7.1/docker-compose-`uname -s`-`uname -m` > /usr/local/bin/docker-compose
chmod +x /usr/local/bin/docker-compose

/etc/init.d/docker start

# create and copy public key to github to clone private repo

ssh-keygen -t rsa -b 4096 -C "user@gmail.com"

cat ~/.ssh/id_rsa.pub


# clone your private repository
git clone git@github.com:shantanuo/xxx.git

# create your custom my.cnf config file

mkdir -p /my/custom/
vi /my/custom/my.cnf
[mysqld]
sql_mode=''

# start docker containers as per yml file

/usr/local/bin/docker-compose up -d
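The yml file itself comes from the cloned repository. A minimal sketch of what such a docker-compose.yml (compose v1 format) might contain is shown below; the service names, the app service, and the config mount path are assumptions:

cat > docker-compose.yml << 'EOF'
mysql:
  image: tutum/mysql:5.5
  ports:
    - "3306:3306"
  volumes:
    - /my/custom:/etc/mysql/conf.d

app:
  build: .
  links:
    - mysql
EOF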

# restore mysql data into container from host:
mysqladmin -h localhost -P 3306 --protocol=tcp -u root -ppasswd create livebox

mysql -h localhost -P 3306 --protocol=tcp -u root -ppasswd livebox < livebox.sql

# access mysql container from another container:

mysql -h mysql -uroot -ppasswd
_____

## use the container ID of the tutum/mysql:5.5 container, e.g.
docker logs c7950eeab8e9
    mysql -uadmin -pxmFShXB1Asgn -h127.0.0.1

# use the password to connect to 127.0.0.1 and execute commands:
mysql> grant all on *.* to 'root'@'%' identified by 'passwd' with grant option;
Query OK, 0 rows affected (0.00 sec)

# data only container to be used for mysql data
docker run -d -v /var/lib/mysql --name db_vol1 -p 23:22 tutum/ubuntu:trusty
docker run -d --volumes-from db_vol1 -p 3306:3306 tutum/mysql:5.5



April 10, 2016

 

MySQL using docker

1) Create mysql data directory
mkdir -p /opt/Docker/masterdb/data

2) Create my.cnf file
mkdir -p /opt/Docker/masterdb/cnf

vi /opt/Docker/masterdb/cnf/config-file.cnf
# Config Settings:
[mysqld]
server-id=1

innodb_buffer_pool_size = 2G
innodb_log_file_size=212M
binlog_format=ROW
log-bin

3) Run docker image

docker run --name masterdb -v /opt/Docker/masterdb/cnf:/etc/mysql/conf.d -v /opt/Docker/masterdb/data:/var/lib/mysql -e MYSQL_ROOT_PASSWORD=mysecretpass -d shantanuo/mysql

The official docker image for mysql can be used by simply specifying mysql:5.5.
You can also use Percona or the image maintained by Oracle by specifying percona:5.5 or mysql/mysql-server:5.5.
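To verify that the custom config file was picked up, a quick check along these lines should work, assuming the mysql client is available inside the image:

docker exec -it masterdb mysql -uroot -pmysecretpass -e "show variables like 'server_id'"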



February 28, 2016

 

docker mysql

You can pull the latest mysql image and create a container named "new-mysql1". Then start the container with a second command as shown below:

docker create -p 3306:3306 -e MYSQL_ROOT_PASSWORD=password --name="new-mysql1" mysql:latest

docker start 22914939f301   # the container ID printed by "docker create"; the name new-mysql1 works as well

Or merge create + start into a single "run" command as shown below:

docker run --name new-mysql1 -p 3306:3306 -e MYSQL_ROOT_PASSWORD=password -d mysql/mysql-server:latest
_____

Then you can access it from your host using the mysql command line:

mysql -h127.0.0.1 -ppassword -uroot
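On the very first start the container has to initialize its data directory, so the port may not answer immediately. A small wait loop helps (same credentials as above; mysql client tools assumed on the host):

# poll until the server inside the container responds
until mysqladmin -h127.0.0.1 -uroot -ppassword ping 2>/dev/null; do
    sleep 2
done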



August 09, 2015

 

Using the same character sets across all tables

It is very important to use the same character set across all tables and across all databases.

The default character set used by each database can be listed using this query...

mysql> select * from information_schema.SCHEMATA;
+--------------+--------------------+----------------------------+------------------------+----------+
| CATALOG_NAME | SCHEMA_NAME        | DEFAULT_CHARACTER_SET_NAME | DEFAULT_COLLATION_NAME | SQL_PATH |
+--------------+--------------------+----------------------------+------------------------+----------+
| def          | information_schema | utf8                       | utf8_general_ci        | NULL     |
| def          | emailplatform      | latin1                     | latin1_swedish_ci      | NULL     |
| def          | mysql              | latin1                     | latin1_swedish_ci      | NULL     |
| def          | performance_schema | utf8                       | utf8_general_ci        | NULL     |
| def          | test               | latin1                     | latin1_swedish_ci      | NULL     |
| def          | test1              | latin1                     | latin1_swedish_ci      | NULL     |
+--------------+--------------------+----------------------------+------------------------+----------+
6 rows in set (0.01 sec)

If some of the tables in a database do not match the other tables, we need to alter those tables. For example, if 1 or 2 tables in the test database are using utf8 encoding, the best choice is to drop and recreate those tables as latin1.
There are 2 points to note here...
1) The table in question should not be part of foreign key relations.
2) The table should not actually contain unicode characters, because once we convert it to latin1 there will be no way to store them.

The easiest way to check whether the tables in a given database are using different character sets is the mysqldump command, as shown here...
root@ip-10-86-106-75:/home/ubuntu# mysqldump test --no-data | grep ENGINE
) ENGINE=TokuDB DEFAULT CHARSET=latin1;
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
) ENGINE=TokuDB DEFAULT CHARSET=latin1;

This means a few tables use the utf8 character set while the others are latin1.
You will have issues when joining a latin1 table with a utf8 table: the query will not use indexes if the joined columns have different character sets.
_____

The following query, run against the "information_schema" database, shows that there is 1 table in the test database that has a utf8 collation. All other tables in the test database are latin1. Therefore I need to change that single utf8 table to latin1.

# create a new table in test database
create table test.tables select * from information_schema.tables;

# check for table_collations

mysql> select table_schema, table_collation, count(*) as cnt from test.tables group by table_schema, table_collation;
+--------------------+-------------------+-----+
| table_schema       | table_collation   | cnt |
+--------------------+-------------------+-----+
| emailplatform      | latin1_swedish_ci |  64 |
| information_schema | utf8_general_ci   |  45 |
| mysql              | latin1_swedish_ci |   1 |
| mysql              | utf8_bin          |   8 |
| mysql              | utf8_general_ci   |  15 |
| performance_schema | utf8_general_ci   |  17 |
| test               | latin1_swedish_ci |  18 |
| test               | utf8_general_ci   |   1 |
| test1              | latin1_swedish_ci |   1 |
+--------------------+-------------------+-----+
9 rows in set (0.00 sec)

To find the name of the table I use this query:

mysql> select table_name from tables where table_schema = 'test' and table_collation like 'utf8%';
+------------+
| table_name |
+------------+
| cdr_master |
+------------+
1 row in set (0.01 sec)

And here is the count:

mysql> select count(*) from test.cdr_master;
+----------+
| count(*) |
+----------+
|   186166 |
+----------+
1 row in set (0.04 sec)

This is a relatively small table, so we can quickly change the character set.
But we need to check that no other table is linked to it through foreign key relations.

mysql> select * from information_schema.KEY_COLUMN_USAGE where TABLE_SCHEMA='TEST' AND TABLE_NAME = 'cdr_master' LIMIT 1\G
Empty set (0.00 sec)

mysql> select * from information_schema.KEY_COLUMN_USAGE where REFERENCED_TABLE_SCHEMA='TEST' AND REFERENCED_TABLE_NAME = 'cdr_master' LIMIT 1\G
Empty set (0.06 sec)

_____

To change the default character set we need to drop and recreate the table.

Take the backup of the table.

# mysqldump --tab=/tmp/ test cdr_master

Change the character set:

# sed -i.bak 's/CHARSET=utf8/CHARSET=latin1/' /tmp/cdr_master.sql

Recreate the new table after dropping the old one:

# mysql test < /tmp/cdr_master.sql

# restore data:

mysql> load data infile '/tmp/cdr_master.txt' into table test.cdr_master;
Query OK, 186166 rows affected (25.06 sec)
Records: 186166  Deleted: 0  Skipped: 0  Warnings: 0

Now log out and log back in to find that all the tables in the test database have the same table_collation.

mysql> select table_collation, count(*) as cnt from information_schema.tables where table_schema = 'test' group by table_collation;
+-------------------+-----+
| table_collation   | cnt |
+-------------------+-----+
| latin1_swedish_ci |  19 |
+-------------------+-----+
1 row in set (0.00 sec)

_____

In order to check whether the utf8-encoded table really has any unicode characters, you need to take a backup of the table.

mysqldump test tbl_name > /tmp/todel.txt

And then run this Python script. If the script processes all the lines without any problem, the data is compatible with latin1.

import codecs

# the dump file is utf-8; try to re-encode every line as latin1
f = codecs.open("/tmp/todel.txt", "r", "utf-8")
for line in f.readlines():
    todel = line.encode('latin1')

You have data that cannot be stored as latin1 if you get a UnicodeEncodeError like this...

UnicodeEncodeError: 'latin-1' codec can't encode characters in position 31-35: ordinal not in range(256)

In that case, you cannot continue with the task of table re-creation unless you decide to ignore the unicode characters that would be lost in the conversion.
_____

If you want to check which lines contain unicode characters, you need another type of dump (the --skip-extended-insert option generates one line per record in the table).

# mysqldump test todel --skip-extended-insert > /tmp/todel.txt

And the following python code will display all the records where unicode characters are used.

import codecs

# print every record that cannot be represented in latin1
f = codecs.open("/tmp/todel.txt", "r", "utf-8")
for line in f.readlines():
    try:
        todel = line.encode('latin1')
    except UnicodeEncodeError:
        print line

_____

If recreating the entire table is not an option, then simply change the single column to latin1.

mysql> alter table test.cdr_master modify column call_uuid varchar(50) charset latin1;

This will make the query very fast if the column "call_uuid" is used in the join query. The other table's "call_uuid" column should also be latin1, as sketched below.
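The matching change on the other side of the join would look like this; the table name cdr_detail here is hypothetical:

mysql -e "alter table test.cdr_detail modify column call_uuid varchar(50) charset latin1"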
_____

If you use extended explain, you will see what MySQL is trying to do internally.

mysql> explain extended select * from a inner join b on a.column = b.column;
mysql> show warnings;

The warning will show that MySQL converts the latin1 column to utf8 in order to compare it with the utf8 column. This internal conversion prevents MySQL from using an index on that column.



May 11, 2015

 

Copy data from Redshift to MySQL

Here is Python code that connects to Redshift and pulls the data into a dataframe. It then copies the data to a MySQL table. If the table exists, it will be replaced.

import easyboto
x = easyboto.myboto('xxx', 'yyy')
mydf = x.runQuery("select * from pg_table_def where schemaname = 'public'")
mydf.columns = ['schemaname', 'tablename', 'column_nm', 'type', 'encoding', 'distkey', 'sortkey', 'notnull']

import sqlalchemy
engine = sqlalchemy.create_engine('mysql://dba:dba@127.0.0.1/test')
mydf.to_sql('testdata', engine, if_exists='replace')
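A quick sanity check from the shell, using the same credentials as in the sqlalchemy URL above:

mysql -udba -pdba -h127.0.0.1 test -e "select count(*) from testdata"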



May 09, 2015

 

convert any excel file to MySQL using 5 lines of code

Two lines to open and read the Excel file.
Two lines to connect to the MySQL server.
One line to dump the data to MySQL.
Life could not be easier than this, thanks to pandas and Python! :)

                                            
import pandas as pd
df=pd.read_excel("5ch.xls", parse_dates='True', header=3)

import pymysql
conn = pymysql.connect(host='localhost', port=3306, user='dba', passwd='dba', db='test')

df.to_sql(con=conn, name='temp_aaa', if_exists='replace', flavor='mysql')
_____

You can of course use pandas functions like melt, loc, concat or merge (and several others) before pushing the data to MySQL.

pd.melt(df, ["number", "reg_no", "st_name"], var_name=["c_six_to_ten"]).sort("reg_no")

df.loc[df['state'] == df['state'].shift(), 'state'] = ''

pd.concat([df,df1], ignore_index=True).drop_duplicates()
 
pd.merge(df, df1, on='InvDate')



May 04, 2015

 

Cursor Example

A cursor can return records in dictionary format, which makes it easier for Python to process the data further. Here is an example of two types of cursors.

# insert a record using cursor
import pymysql
conn = pymysql.connect(host='localhost', port=3306, user='dba', passwd='dbapass', db='test')
cur = conn.cursor()
cur.execute("INSERT INTO ptest VALUES (%s, %s, %s)", ('aa', 'bb', 'test1'))  # let the driver handle the quoting
conn.commit()

# using dictionary cursor
import pymysql
conn = pymysql.connect(host='localhost', port=3306, user='dba', passwd='dbapass', db='test')
cur = conn.cursor(pymysql.cursors.DictCursor)
cur.execute("select * from ptest where candidate_name > '2015-03-01 00:00:00' and candidate_name < '2015-03-31 00:00:00' limit 10")
for rw in cur.fetchall():
    print rw
   



January 24, 2015

 

query text files

Usually we need to import the CSV data to SQL in order to run the queries. How about running queries directly on the text file?

# git clone https://github.com/harelba/q.git

# cd q

# chmod +x ./bin/q

# ./bin/q "SELECT COUNT(1) FROM examples/exampledatafile"
248
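A slightly richer query on the same example file; q automatically names the columns c1, c2, and so on when the file has no header row:

# ./bin/q "SELECT c1, COUNT(1) FROM examples/exampledatafile GROUP BY c1"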

If git is not installed, use the single file found here...

# wget -O q "https://cdn.rawgit.com/harelba/q/1.5.0/bin/q?source=install_page&table=1"

Interesting!



December 22, 2014

 

dealing with mysql issues

While troubleshooting a MySQL issue, the first place to check is the error log. If the error log is clean, the next thing to evaluate is the slow query log.

# enable slow query log
mysql> set global slow_query_log = on;

# change the default 10 seconds to 1 second
# make sure that the queries not using indexes are logged
SET GLOBAL long_query_time=1;
SET GLOBAL log_queries_not_using_indexes=1;

# If the slow log is growing too fast, feel free to set the variables back to how they were:

SET GLOBAL long_query_time=10;
SET GLOBAL log_queries_not_using_indexes=0;

# or disable the slow query log
set global slow_query_log = off;
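Once the slow log has collected some data, the bundled mysqldumpslow tool gives a quick summary. The log file path below is an assumption; check the slow_query_log_file variable for the real one:

# find out where the slow log is written
mysql -e "show variables like 'slow_query_log_file'"

# top queries sorted by total time
mysqldumpslow -s t /var/log/mysql/mysql-slow.log | head -20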



January 12, 2014

 

Review Database

Here is how to quickly review a database. We can query the information_schema database to check whether the column types follow internal guidelines.

mysql> select DATA_TYPE, COUNT(*) AS CNT, GROUP_CONCAT(DISTINCT(COLUMN_TYPE)) from information_schema.COLUMNS
where TABLE_SCHEMA = 'recharge_db' GROUP BY DATA_TYPE;

+-----------+-----+----------------------------------------------------------------------------------------+
| DATA_TYPE | CNT | GROUP_CONCAT(DISTINCT(COLUMN_TYPE))                                                    |
+-----------+-----+----------------------------------------------------------------------------------------+
| bigint    |  15 | bigint(20)                                                                             |
| datetime  |   6 | datetime                                                                               |
| double    |   3 | double                                                                                 |
| int       |  24 | int(11),int(2)                                                                         |
| text      |   2 | text                                                                                   |
| varchar   |  16 | varchar(100),varchar(500),varchar(10),varchar(20),varchar(15),varchar(255),varchar(25) |
+-----------+-----+----------------------------------------------------------------------------------------+
6 rows in set (0.01 sec)

The suggestions would be (a sketch of the corresponding ALTER statements follows the list):
1) Change double to decimal
2) Change int(2) to tinyint
3) Change text to varchar(1000) if possible
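The table and column names below are hypothetical; they only illustrate what the suggested changes would look like:

mysql recharge_db -e "alter table payments modify amount decimal(10,2)"
mysql recharge_db -e "alter table payments modify status_flag tinyint"
mysql recharge_db -e "alter table payments modify remarks varchar(1000)"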

mysql> select IS_NULLABLE, COUNT(*) AS CNT from information_schema.COLUMNS
where TABLE_SCHEMA = 'recharge_db' AND COLUMN_KEY != 'PRI'  GROUP BY IS_NULLABLE;
+-------------+-----+
| IS_NULLABLE | CNT |
+-------------+-----+
| NO          |   1 |
| YES         |  52 |
+-------------+-----+
2 rows in set (0.00 sec)

Most of the columns are nullable; those should be changed to "NOT NULL".
Certain columns cannot be changed to "NOT NULL" and are candidates for further normalization.

The following query will list all the columns that can be linked to the table that declares the same column as its primary key.

select COLUMN_NAME, COUNT(*) AS CNT, GROUP_CONCAT(IF(COLUMN_KEY = 'PRI', concat(TABLE_NAME, '_primary_key'), TABLE_NAME) order by COLUMN_KEY != 'PRI', TABLE_NAME) as tbl_name
from information_schema.COLUMNS
where TABLE_SCHEMA = 'recharge_db'
group by COLUMN_NAME HAVING CNT > 1 AND tbl_name like '%_primary_key%';
_____

Find missing primary key:

SELECT table_schema, table_name
FROM information_schema.tables
WHERE (table_catalog, table_schema, table_name) NOT IN
(SELECT table_catalog, table_schema, table_name
FROM information_schema.table_constraints
WHERE constraint_type in ('PRIMARY KEY', 'UNIQUE'))
AND table_schema = 'recharge_db';
_____

// Check if the column names are consistent and as per standard

select column_name, count(*) as cnt
from information_schema.columns where TABLE_SCHEMA = 'recharge_db'
GROUP BY column_name;

_____

Data Normalization tips:
1) Data should be normalized
2) There should be no NULL values in any column
3) There should be no need to use "update" statement



October 29, 2013

 

killing and starting mysql service

Before you kill the mysql process from the command prompt, it is very important to note the mysql parameters. Reading the following output carefully will save a lot of time later on.

root@ip-10-28:/usr# ps aux | grep mysql
root     15711  0.0  0.1   4400   752 pts/1    S    09:53   0:00 /bin/sh ./bin/mysqld_safe
root     16084  1.0 38.7 1584048 233956 pts/1  Sl   09:53   0:00 /usr/sbin/mysqld --basedir=/usr --datadir=/var/lib/mysql --plugin-dir=/usr/lib/mysql/plugin --user=root --pid-file=/var/run/mysqld/mysqld.pid --socket=/var/run/mysqld/mysqld.sock --port=3306

In order to start this mysql server again, I will need to change directory to the base directory /usr/ and from there start mysql using ./bin/mysqld_safe &
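In other words (the base directory comes from the --basedir value shown by ps above):

cd /usr
./bin/mysqld_safe &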



October 25, 2013

 

TokuDB an engine of future


TokuDB is highly recommended for slaves in a production environment. Here are 9 advantages of using TokuDB in production.

1) Data warehouse / big data solution:
TokuDB may take the same amount of time as InnoDB to complete a query if the data is small and fits in memory.
But with huge data, InnoDB is not able to keep up, while TokuDB gives you quick results.

2) Integrates with MySQL:
TokuDB is an engine just like MyISAM and InnoDB. It is listed as the default engine in the output of the "show engines" command.

3) Highly compressed data:
TokuDB compresses data down to roughly 15 to 20% of the space used by MyISAM or InnoDB.

TokuDB disk consumption:
-rwxrwx--x 1 root root 222M Oct 25 19:27 _akola_ticket_main_41e0_1_18_B_0.tokudb
-rwxrwx--x 1 root root  68M Oct 25 19:27 _akola_ticket_key_tripindex_41e0_1_18_B_2.tokudb
-rwxrwx--x 1 root root  52M Oct 25 19:27 _akola_ticket_key_route_no_41e0_1_18_B_5.tokudb
-rwxrwx--x 1 root root  50M Oct 25 19:27 _akola_ticket_key_adjtripno_41e0_1_18_B_3.tokudb
-rwxrwx--x 1 root root  36M Oct 25 19:42 _akola_ticket_key_wbindex_41e0_1_18_B_1.tokudb
-rwxrwx--x 1 root root  34M Oct 25 19:27 _akola_ticket_key_NewIndex1_41e0_1_18_B_4.tokudb

MyISAM disk consumption for the same table:
-rw-rw---- 1 root root 1.5G Jul 10 00:36 ticket.MYD
-rw-rw---- 1 root root 335M Jul 10 00:40 ticket.MYI

4) Easy data transfer:
I can simply convert a table to TokuDB using an alter table statement.

alter table tbl_name engine=TokuDB;

Or I can take a backup without specifying the engine. Since TokuDB is the default engine, the tables will be converted to TokuDB while restoring the data.

mysqldump db_name --routines --compatible=no_table_options > db_name_to_toku.sql

5) Crash Recovery:
TokuDB has excellent crash recovery, just like InnoDB (something MyISAM is missing!).

6) Replication - eliminate slave lag:
MySQL's single-threaded replication design often leads to slave lag. With TokuDB, slave lag is eliminated.
This ensures replication can be used for read scaling, backups, and disaster recovery, without resorting to sharding or expensive hardware.

7) Hot Schema Changes in Seconds:
TokuDB introduced Hot Column Addition (HCAD). You can add or delete columns from an existing table with minimal downtime.

8) No need to configure parameters:
You may only need to increase the open files limit if you get an error like...
[ERROR] Error in accept: Too many open files
open-files-limit=40000

Other than this you do not need to configure anything! Now compare this with InnoDB.

By default, TokuDB uses about half the memory of the machine as buffer pool (or disk cache). That leaves the other half of memory for the OS to allocate.

9) InnoDB disk consumption:
InnoDB stores data in ibdata files and the disk consumption is not reduced even if you drop tables.
TokuDB tables work like MyISAM and free the disk space immediately once you drop / truncate tables.



September 27, 2013

 

Optimum size of varchars

Here is an excellent explanation of why varchars should not be declared larger than their optimum size.

MySQL materializes query results to a temporary table, using the MEMORY engine if it can, then converting to MyISAM if it gets too big. The MEMORY engine doesn't support variable length rows, so VARCHARs are converted to CHARs. The field `d` only contains one character in each row, but is defined as a VARCHAR(20000) - which means the temporary table will grow very large. Even when converted to MyISAM, it retains that fixed width format, so the temp table is huge.

Source: http://thenoyes.com/littlenoise/?p=167

Another reason is that the limit of roughly 750 characters on index key length may come into the picture. Loosely declared columns cannot be indexed properly, and composite indexes become almost impossible.



June 01, 2013

 

Welcome infobright

Infobright is MySQL-compatible data warehouse software. The free community version does not allow you to insert / update / delete records; the only way to import data is the "load data" statement. Here is a simple shell script that will copy the data from MySQL to Infobright. The table needs to be created in Infobright manually; that is not part of this script because the MySQL data types and indexes need to be modified.

#!/bin/sh

ibdata='test'
ibtable='window_way'

mysqlshow | awk '{print $2}' | while read dbname
do

rm -f /tmp/load.txt
rm -rf /tmp/$ibdata

time mysql -uroot -pPassWdsql $dbname -e"select *, database() into outfile '/tmp/load.txt' from $ibtable"

time mysql-ib -uroot -pPassWdIB $ibdata -e"set @BH_REJECT_FILE_PATH = '/tmp/$ibdata'; set @BH_ABORT_ON_COUNT = 1000; load data infile '/tmp/load.txt' into table $ibtable fields terminated by '\t' "

done



March 06, 2013

 

InnoDB force recovery

I did not shut down the mysql service cleanly while stopping the server. The next time it started, I got the following message in the mysql error log file (mysqld.log or /var/log/syslog).

[Note] Plugin 'FEDERATED' is disabled.
InnoDB: The InnoDB memory heap is disabled
InnoDB: Mutexes and rw_locks use GCC atomic builtins
InnoDB: Compressed tables use zlib 1.2.3.4
InnoDB: Initializing buffer pool, size = 128.0M
InnoDB: Completed initialization of buffer pool
InnoDB: highest supported file format is Barracuda.
The log sequence number in ibdata files does not match
the log sequence number in the ib_logfiles!
InnoDB: Database was not shut down normally!
Starting crash recovery.
Reading tablespace information from the .ibd files...
InnoDB: Operating system error number 13 in a file operation.
InnoDB: The error means mysqld does not have the access rights to
InnoDB: the directory.
InnoDB: File name .
InnoDB: File operation call: 'opendir'.
InnoDB: Cannot continue operation.


The fix was easy in this case: just add the following line in the [mysqld] section of the my.cnf file.

innodb_force_recovery = 6

* InnoDB is started in read-only mode, preventing users from performing INSERT, UPDATE, or DELETE operations.
* You can SELECT from tables to dump them (see the sketch after this list), or DROP or CREATE tables even if forced recovery is used.
* If you know that a given table is causing a crash on rollback, you can drop it.
* You can also use this to stop a runaway rollback caused by a failing mass import or ALTER TABLE.
* You can kill the mysqld process and set innodb_force_recovery to 3 to bring the database up without the rollback.
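A typical next step, once the server is up in forced-recovery mode, is to dump everything, remove the setting, and restore. The file path below is an assumption:

# dump all databases while innodb_force_recovery is active
mysqldump --all-databases --routines > /tmp/all_databases.sql

# then remove innodb_force_recovery from my.cnf, restart mysqld
# (with a clean data directory if necessary), and restore the dump
mysql < /tmp/all_databases.sql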



December 28, 2012

 

Auto backup

Auto MySQL Backup is a shell script that helps to automate the backup process.

http://sourceforge.net/projects/automysqlbackup/

Here are 7 easy steps to create a backup of any database.

1) connect to the server.

2) Download the package:
wget http://tinyurl.com/c52x3sh

3) Extract files:
tar xvfz automysqlbackup-v3.0_rc6.tar

4) make sure you are "root" and then install:
sh install.sh

5) create backup folder:
mkdir /var/backup/
# this backup location can be changed as shown below

6) Make changes to the user name / password and the DB that needs to be backed up:
vi /usr/local/bin/automysqlbackup
  CONFIG_mysql_dump_password='admin'
  CONFIG_db_names=('drupaldb')
  CONFIG_backup_dir='/var/backup/db'

7) Run the script to take the backup:
/usr/local/bin/automysqlbackup
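To run the backup automatically, a daily cron entry along these lines can be added; the schedule is just an example:

# run automysqlbackup every day at 2 am (add via "crontab -e")
0 2 * * * /usr/local/bin/automysqlbackup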



August 24, 2012

 

Online Utilities

Regular Expressions demystified

There are times when a regular expression is the only way to find text in a given string.
Here is a really useful utility that explains and, at the same time, helps us write complex regular expressions.

http://gskinner.com/RegExr/?31u07
_____

SQL formatter

The following utility will help to format a complex SQL query.

http://www.dpriver.com/pp/sqlformat.htm




May 04, 2012

 

Copy MySQL data to Google Big Query

Here is a script that will dump the data from MySQL and upload it to Google Cloud Storage. It will then import the data from the cloud into a Google BigQuery database.
#!/bin/sh
yester_day=`date '+%Y%m%d' --date="1 day ago"`
schema_name="miniLogArchives"
tbl_name="r_mini_raw_$yester_day"
raw_location="/mnt/$yester_day"
field_term="^"
file_prefix='may3'
# split 4 GB
split_bytes="4000000000"
# google bucket name
bucket_name='log_data'

mkdir $raw_location
cd $raw_location
time mysqldump $schema_name $tbl_name --tab=$raw_location --fields-terminated-by="$field_term"

# make sure input file is UTF-8 file
file $tbl_name.txt

# split files with prefix 
time split -C $split_bytes $tbl_name.txt $file_prefix

for file in `ls $file_prefix*`
do
# big query does not seem to like double quotes
time sed -i 's/\"//g' $file
time gzip $file

# copy to google cloud and then import data to big query

time gsutil cp $file.gz gs://$bucket_name/
time bq --nosync load -F '^' --max_bad_record=30000 mycompany.raw_data gs://$bucket_name/$file.gz ip:string,cb:string,country:string,telco_name:string, ...

done

exit

# make sure that mysql has write permission to raw_location path /mnt/
# make sure the 2 utilities are installed from ...
# https://developers.google.com/storage/docs/gsutil_install#install
# http://code.google.com/p/google-bigquery-tools/downloads/list


Here is another shell script that will copy the current MySQL table to big-query.

time mysql -e"drop table if exists test.google_query"

time mysql vserv -e"create table test.google_query engine=MyISAM
SELECT s.zone_id AS zone_id, ...
WHERE ncyid = 3 AND s.date_time>='2012-07-01 00:00:00' AND s.date_time<='2012-07-02 23:59:59' "

time mysqldump test google_query --tab=/mnt/
time gzip /mnt/google_query.txt

time bq --nosync load  -F '\t' --max_bad_record=30000 --job_id summary_2012_07_09_to_2012_07_02  company.summary02 /mnt/google_query.txt.gz zone_id:string,requests:integer,total_revenue:float,...

time bq query "SELECT FROM company.summary02 WHERE date_time>='2012-07-01 00:00:00' AND date_time<='2012-07-02 23:59:59' GROUP BY continent, country"

// this tutorial shows how to integrate big-query results into google spreadsheet.

https://developers.google.com/apps-script/articles/bigquery_tutorial

// gsutil can be used with both Google Cloud Storage and AWS S3, provided you have added the aws_key and aws_secret_key to the ~/.boto file:
gsutil -m cp -R s3://from_aws/ gs://daily_log/

// gsutil supports a multi-thread option while s3cmd supports a sync option, as shown below:
s3cmd sync local_dir s3://from_aws/

// create a JSON schema from MySQL and save it as summary.json
select CONCAT('{"name": "', COLUMN_NAME, '","type":"', IF(DATA_TYPE like "%int%", "INTEGER",IF(DATA_TYPE="decimal","FLOAT","STRING")) , '"},') as json from information_schema.columns where TABLE_SCHEMA = 'test' AND TABLE_NAME = 'summary'

mysqldump test summary --tab=.

time bq load --nosync -F '\t' --max_bad_record=30000 company.ox_banners ./ox_banners.txt ./ox_banners.json

_____

Here is the script that will copy a MySQL table to a BigQuery table.

#!/bin/sh
BIG_QUERY_DB='company'
# change the big query db variables and then
# run the script with mysql DB and Table name for e.g.
# sh -xv query.sh asgs1a_vol_e6d65888 ox_data_archive_r_20120727

if [ $# -eq 0 ]
  then
    echo "DB and table name required"
exit 1
fi

TABLE_SCHEMA=$1
TABLE_NAME=$2

cat > json_query.txt << heredoc
select CONCAT('{"name": "', COLUMN_NAME, '","type":"', IF(DATA_TYPE like "%int%", "INTEGER",IF(DATA_TYPE="decimal","FLOAT","STRING")) , '"},') as json from information_schema.columns where TABLE_SCHEMA = '$TABLE_SCHEMA' AND TABLE_NAME = '$TABLE_NAME';
heredoc

echo '[' >  $TABLE_NAME.json
mysql -Bs < json_query.txt | sed '$s/,$//' >> $TABLE_NAME.json
echo ']' >>  $TABLE_NAME.json


mysqldump $TABLE_SCHEMA $TABLE_NAME --tab=.
time sed 's/\"//g' $TABLE_NAME.txt | gzip -c > $TABLE_NAME.txt.gz

mytime=`date '+%y%m%d%H%M'`
time bq load --nosync -F '\t' --job_id="$TABLE_NAME$mytime" --max_bad_record=30000 $BIG_QUERY_DB.$TABLE_NAME $TABLE_NAME.txt.gz $TABLE_NAME.json

# download the script
wget http://mysqldump.googlecode.com/files/big_query.sh

# if required use dos2unix new_query.sh command to make it in linux format
# now dump and transfer the data from table mytable of company database
sh -xv big_query.sh company mytable

# If you need to process the text files, use the following script
wget http://mysqldump.googlecode.com/files/final_big_query.sh
 _____

The following script will remove all the data from google cloud.
Use it with utmost care!

#!/bin/sh
for mybucket in `gsutil ls`
do
gsutil rm -R $mybucket > /dev/null
gsutil rb $mybucket
done

Script to remove all tables from big query

#!/bin/sh
# can add -r -f to rm command
# if you want to forcefully remove all tables
# datasetId from the command "bq ls"
datasetId=company

for mytable in `bq ls $datasetId`
do
bq rm $datasetId.$mytable
done

#You were warned


