Search

Recent Posts

Tags

Google Nexus 4 Phone – What they don’t tell you…

By Dale Reagan | April 7, 2013

So, the first time around (early Android) I had a few comments about Google Android phones and privacy; seems that things are a bit better (at least at first) for the current generation (12/2012) of Android based systems but there are still privacy issues (IMO.)

Previous Android Phone posts

Based on a review of the previous posts from 2009 – there have been some fixes/updates from Android/Google-land during the past ~ 3 years but, overall, privacy is still-not-an-option.   Great phone for folks who don’t mind sharing literally everything in-the-cloud…  I’m exploring options for using some sort of local, personal-cloud – should be possible.  This time around I took-the-plunge and simply purchased the phone directly from Google (not an option previously.)

No Sync to PC Outlook data

– easy to put your data in the Google Cloud and then ‘sync’ to your phone – but, then your data is no longer private…  Workarounds:

Working with Funambol –  Contacts/Schedule Sync Server

I gave up on my first pass with this since the Demo did not work; I returned after not finding another solution (and finding the link above.)  While in general, the ‘community edition’ docs are ok, they are not-in-sync with what you see and what I had to do to get a working-sync solution.  I installed and configured:

I can update my calendar or contacats on any device below and then sync diffs from:

The Android Nexus 4 (newest phone from Google in 2012) does NOT support Outlook Notes or Tasks – see this post for more information on using  Funambol.

 Once you login to your Gmail account

– you cannot log out (since ‘services’ are tied to your email account.)  possible workarounds:

The only reason to ‘login’ is if/when you ‘need something’ (i.e. get/install something new.)  I have no desire to ‘share’ my personal info as I surf or use other ‘services’ – which are advertised as being ‘free’ – TOS aside I will guess that more users will raise concerns about this ‘approach’ (i.e. the 12/2012 incident w/Instagram and using images for advertising…)

Android Disk?

Export Contacts from Pop Email

No ‘file explorer’?

No real, on-device Schedule/Contact Management

Plan to Root your Nexus?

 Overall – Satisfied

At least, at this point.

Would be nice if:

I’m still exploring this device so I’m sure I’ll find more interesting features/quirks; as always, your signal may vary, at least a little bit. 🙂

Topics: Computer Technology, Internet Search, Mobile Web, System and Network Security, Unix-Linux-Os, Web Technologies | Comments Off on Google Nexus 4 Phone – What they don’t tell you…

Android Nexus 4 – using Sqlite to ’empty’ Pop Email ‘trash’

By Dale Reagan | April 7, 2013

Empty POP Email ‘trash’ (Android)

Ok, after some digging and a bit of thought:

Simple solutions:

  1. Don’t use Google Android Nexus 4 Pop email…
  2. Use some other Android email for Pop access…
  3. Learn a little Sqlite3 and put together a few ‘steps’ or a script to deal with this problem – that’s what I will explore later in this post.

In my previous post on this topic I provided the output of the CSV headers for the Sqlite3 Database files that appear to be used for Pop email on the Google Android Nexus 4.  Note that this issue/post is being used with Android 4.2.2 (in early 2013) so things could be different if you are using an older/newer version of the Android OS and tools.

Prerequisites:

  1. Working Android Phone w/working POP email setup AND/Or Working AVM (Android Virtual Machine)
  2. Android SDK with working ‘adb’ connection (I’m using USB)
  3. Linux (or other) system with a working install of Sqlite3
  4. copies of the Email database files for ‘testing’ (yep, gonna ‘test’ first…)

Pull your Email DB files to your PC

In my case the files are:

/data/data/com.google.android.email/databases/EmailProviderBody.db
/data/data/com.google.android.email/databases/EmailProvider.db

If you try to ‘pull’ these files you will (most likely) get an error:

$ adb pull /data/data/com.google.android.email/databases/EmailProvider.db
remote object ‘/data/data/com.google.android.email/databases/EmailProvider.db’ does not exist

Ok, I know that it exists – what if I try a file listing?

$ adb shell ls -l /data/data/com.google.android.email/databases/EmailProvider.db
/data/data/com.google.android.email/databases/EmailProvider.db: Permission denied

Ok, this makes sense – access is restricted – so how do I get a copy for testing?

Since this phone is ‘rooted’ I:

  1. connect via adb shell
  2. then use ‘su -‘
  3. copy the file to an ‘open’ location
  4. exit su & adb
  5. now copy the file(s) via adb pull

Wait, wait – what about ‘adb root’?  Here is what I get – it suggests that you would be fine working on your AVD but guessing it would NOT work on your Android phone:

$ adb root
adbd cannot run as root in production builds

Adb example session to place files in a ‘readable’ location:

$ adb shell
shell@android:/ $ su -
root@android:/ # cp -v /data/data/com.google.android.email/databases/EmailProvider.db /sdcard/temp/EmailProvider.db
root@android:/ # exit
shell@android:/ $ exit


$ adb pull /sdcard/temp/EmailProvider.db .
2699 KB/s (1093632 bytes in 0.395s)


$ ls -l EmailProvider.db
-rw-r--r-- 1 dale dale 1093632 Mar 31 10:49 EmailProvider.db

Now cleaup/remove the file(s) from ~/temp folder:

$ adb shell rm /sdcard/temp/EmailProvider.db


Ok, time to review the table structures and figure out what to delete/remove/save.  One item of concern:

POP email passwords are stored in ‘plain text’.

This may be a non-issue for most folks but I don’t think that it is a ‘best practice’ for any Application to use ‘plain text’ for any type of password storage – JMO.  I’m sure folks will argue that since the DB file is in ‘restricted space’ that this is not a concern – I’m not buying it (and I don’t have a solution other than to strongly suggest that any sensitive data should be encrypted – and a suggestion that you don’t use your ‘phone email’ as your ‘primary email’ and don’t use a cell phone if you are concerned about privacy and/or security – cell phones, cell phone networks and most likely, cell phone applications are NOT secure or private…)

Ok, this seems to work but is not confirmed (yet) – this is a one-way op so backups are suggested…

Caveats – this method does NOT deal with email messages that have attachments!  If you take this approach then you will most likely have ‘remnant’ files laying about.  First, you have to determine which ‘mailbox’ to delete from:

$ sqlite3 EmailProvider.db
SQLite version 3.7.11 2012-03-20 11:35:50
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
sqlite> .mode csv
sqlite> select * from Mailbox;
6,INBOX,INBOX,,-1,2,0,0,,0,0,0,6,1,24,25,,6,1364663778896,0,0,0,0,0,4347
7,Drafts,Drafts,,-1,2,3,0,,0,-1,0,0,1,8,25,,3,1357505032257,0,0,0,0,0,0
8,Outbox,Outbox,,-1,2,4,0,,0,-1,0,0,1,8,25,,0,1363452298650,0,0,0,0,0,0
9,Sent,Sent,,-1,2,5,0,,0,-1,0,0,1,8,25,,5,1357505032257,0,0,0,0,0,0
10,Trash,Trash,,-1,2,6,0,,0,-1,0,1266,1,8,25,,1795,1364656359354,0,0,0,0,0,0

The fields for this table (and others) are listed in a  previous post – to just review the ‘Mailbox’ table:

$ sqlite3 EmailProvider.db
SQLite version 3.7.11 2012-03-20 11:35:50
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
sqlite> .mode line
sqlite> select * from Mailbox where _id == 10;
                     _id = 10
             displayName = Trash
                serverId = Trash
          parentServerId = 
               parentKey = -1
              accountKey = 2
                    type = 6
               delimiter = 0
                 syncKey = 
            syncLookback = 0
            syncInterval = -1
                syncTime = 0
             unreadCount = 1266
             flagVisible = 1
                   flags = 8
            visibleLimit = 25
              syncStatus = 
            messageCount = 1795
         lastTouchedTime = 1364656359354
            uiSyncStatus = 0
        uiLastSyncResult = 0
  lastNotifiedMessageKey = 0
lastNotifiedMessageCount = 0
              totalCount = 0
      lastSeenMessageKey = 0
Ok, the serverId 'key' may make more sense - but what you are after is the key value
that you can use with the Message table.
sqlite> select * from Mailbox where serverId == "Trash";
10,Trash,Trash,,-1,2,6,0,,0,-1,0,1266,1,8,25,,1795,1364656359354,0,0,0,0,0,0

Delete data from email database files – this is a one-way ticket! so make sure you are sure!  Note that we are deleting from the Message table but using

$ sqlite3 EmailProvider.db

sqlite> Delete from Message where mailboxKey == 10 ;
sqlite> .exit

and now we remove any ‘body’ table entry that has a value greater than zero.

$ sqlite3 EmailBody.db

sqlite> Delete from Body where messageKey > 0 ;
sqlite> .exit

Retain ‘Empty’ Email Database files to ‘reset’

Provided that Android POP Email data/files are not stored in other files/locations you can ‘reset’ your Android POP email by simply copying the ’empty’ database files (that contain your POP email setup info) over to replace your existing, real-time, POP email database files.

Caveats

There *should be* a simpler way to deal with the Android POP email Trash issue but I have not found it.  When I started this post I was thinking that I might use a script to copy the ’empty’ files into place but I have not tested this yet – perhaps someone will provide a ‘better’ way before I get a script tested/online.

If you have/use more than one POP email account then this approach may need some tweaking; I will guess that the second POP email account would use a ‘Mailbox ID’ that was different (i.e. a higher number/value.)

Another approach – use the ‘system steps’ to ‘delete an account’ but then add them back by using Sqlite3 commands [NOTE – I did NOT test this so caution is advised…], i.e. after clearing the POP email database files you create two entries Per POP email account – one for reading email and one for sending email:

INSERT INTO HostAuth VALUES(3,’pop3′,’pop.YOUR-Domain.yyy’,110,14,’EMAIL_USER’,’READ_EMAIL_PW’,NULL,0,NULL);
INSERT INTO HostAuth VALUES(4,’smtp’,’smtp.YOUR-Domain.yyy’,587,14,’EMAIL_USER’,’SEND_EMAIL_PW’,NULL,0,NULL);

<< HostAuth – Sqlite3 table structure>>
1 | CREATE TABLE HostAuth (_id integer primary key autoincrement
2 |  protocol text
3 |  address text
4 |  port integer
5 |  flags integer
6 |  login text
7 |  password text
8 |  domain text
9 |  accountKey integer
10 | certAlias text);

Other thoughts – there may also already be Sqlite3 tools in the Android ‘store’ that allow you to edit the database files on your Android device – that would probably be ‘better’ than the approaches discussed above.

Even better – perhaps the *next* POP email client for Android will provide more comprehensive data-file management and not leave you with the potential to fill your file system with ‘trash emails’…

Have you created/found some other, better solution? – please share.

As always, your mileage should vary – at  least a bit.  🙂

Topics: Computer Technology, Mobile Web, Reviews|Comments|Feedback, Unix-Linux-Os, Web Problem Solving | Comments Off on Android Nexus 4 – using Sqlite to ’empty’ Pop Email ‘trash’

Android Nexus 4 – Still can’t ’empty trash’…

By Dale Reagan | March 30, 2013

I was annoyed enough by ‘pop email’ issues during my first pass at Android that I simply returned the device.

As mentioned previously, Google hardware is Google-Centric and, most likely, will force you to use other Google products & services – if you want to use Google hardware in an independent manner then you have to figure out how to get some things to ‘work as expected’.

So, Pop emails – you put them in the ‘trash’ and forget about them.  The only ’email setup’ option (that I found) covering email deletion was to:

I took that to mean that my ‘trash folder’ would also be emptied at each email ‘refresh’.  I recently discovered that this is NOT how things ‘work’ – with a few thousand emails in my ‘trash’ folder I am now searching for a solution.

How to really ’empty’ the trash folder for POP Email on the Nexus 4

Delete every message twice, i.e.

  1. read/open message
  2. select trash icon
  3. open trash folder
  4. select/open message again and select trash icon again

Ok, I suppose that this is a step ‘up’ from not being able to delete/remove emails at all (which was the case only a few years ago.)

The really fast way to ‘clear’ the trash folder for POP Email on the Nexus 4

  1. delete the email account using Pop email
  2. re-create the pop email account.

Other options?

  1. Don’t use the Google Android Pop email software?
  2. Use an IMAP instead of Pop email account?
  3. Use a third party Pop email solution (free to $$.)

Hmm, still less than ideal, yes?

What to do?  Research, of course!

  1. It’s been several months so I download & install the latest Android SDK
  2. I put my Nexus phone in ‘debug’ mode
  3. I connect from my Linux system to the Nexus 4 using ‘adb’ (Android Debugger.)
  4. I create a ‘long file’ listing (i.e. ls -l for ALL  files on the device; this requires some tool tinkering but I get it done.)
    • connect via USB
    • adb shell
    • su –
    • search-for-files
    • exit; exit
    • adb pull /some_path/list.of.files.txt
  5. Now I can search my file list for ’email’ – here are the files that I think will be for Pop email:
    • /data/data/com.google.android.email/databases/EmailProviderBackup.db
    • /data/data/com.google.android.email/databases/EmailProviderBody.db
    • /data/data/com.google.android.email/databases/EmailProvider.db
  6. Ok, these are Sqlite3 DB files so if I can get them on my Linux system I can ‘work’ with them (noting that you should also be able to work on them directly on the Nexus phone if you don’t mind the limitations of adb….)
  7. After reviewing the Sqlite3 tables I decide to search the ‘Android Docs’ for a possible solution – the simplest solution does seem to be to delete/re-create the email account and do ‘double deletes’ going forward; there should be a better way so I will have to dig.

Android uses Sqlite3

Sqilte3 is a relatively simple database solution – Sqlite3 Basics (really simple):

$ sqlite3 EmailProviderBody.db
SQLite version 3.7.11 2012-03-20 11:35:50
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
sqlite> .help
.backup ?DB? FILE      Backup DB (default "main") to FILE
.bail ON|OFF           Stop after hitting an error.  Default OFF
.databases             List names and files of attached databases
.dump ?TABLE? ...      Dump the database in an SQL text format
                         If TABLE specified, only dump tables matching
                         LIKE pattern TABLE.
.echo ON|OFF           Turn command echo on or off
.exit                  Exit this program
.explain ?ON|OFF?      Turn output mode suitable for EXPLAIN on or off.
                         With no args, it turns EXPLAIN on.
.header(s) ON|OFF      Turn display of headers on or off
.help                  Show this message
.import FILE TABLE     Import data from FILE into TABLE
.indices ?TABLE?       Show names of all indices
                         If TABLE specified, only show indices for tables
                         matching LIKE pattern TABLE.

.load FILE ?ENTRY?     Load an extension library
.log FILE|off          Turn logging on or off.  FILE can be stderr/stdout
.mode MODE ?TABLE?     Set output mode where MODE is one of:
                         csv      Comma-separated values
                         column   Left-aligned columns.  (See .width)
                         html     HTML <table> code
                         insert   SQL insert statements for TABLE
                         line     One value per line
                         list     Values delimited by .separator string
                         tabs     Tab-separated values
                         tcl      TCL list elements
.nullvalue STRING      Print STRING in place of NULL values
.output FILENAME       Send output to FILENAME
.output stdout         Send output to the screen
.prompt MAIN CONTINUE  Replace the standard prompts
.quit                  Exit this program
.read FILENAME         Execute SQL in FILENAME
.restore ?DB? FILE     Restore content of DB (default "main") from FILE
.schema ?TABLE?        Show the CREATE statements
                         If TABLE specified, only show tables matching
                         LIKE pattern TABLE.
.separator STRING      Change separator used by output mode and .import
.show                  Show the current values for various settings
.stats ON|OFF          Turn stats on or off
.tables ?TABLE?        List names of tables
                         If TABLE specified, only list tables matching
                         LIKE pattern TABLE.
.timeout MS            Try opening locked tables for MS milliseconds
.vfsname ?AUX?         Print the name of the VFS stack
.width NUM1 NUM2 ...   Set column widths for "column" mode
.timer ON|OFF          Turn the CPU timer measurement on or off
sqlite>

Ok, crank up Sqlite 3, set your output file and mode and follow with an export (the ‘dump’ command).

$ sqlite3 EmailProviderBody.db
SQLite version 3.7.11 2012-03-20 11:35:50
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
sqlite> .output test-out.body.csv
sqlite> .mode csv
sqlite> .dump
sqlite> 

$ ls -l test-out.body.csv 
-rw-rw-r-- 1 dale dale 11277187 Mar 30 20:01 test-out.body.csv

So, guessing a bit – looks like Pop email uses two DB files – one for ‘providers’ (i.e. where your ISP & message ‘summary’ data is held) and the ‘body’ DB that contains the details of the message ‘body’; I am also guessing that ‘attachments’ may also be stored/present in such data; I extract the ‘CREATE TABLE’ commands and generate the table structures below.

 

The body db file structure:   Def # Flds:     9 (based on header from 1st line of CSV file)
  Fld | Description 
------|--------------------------------------
        << android_metadata >>
  1 | CREATE TABLE android_metadata (locale TEXT);

        << Body >>
  1 | CREATE TABLE Body (_id integer primary key autoincrement
  2 |  messageKey integer
  3 |  htmlContent text
  4 |  textContent text
  5 |  htmlReply text
  6 |  textReply text
  7 |  sourceMessageKey text
  8 |  introText text
  9 |  quotedTextStartPos integer);
The provider db structure: Def # Flds:    24 (based on header from 1st line of CSV file)

Fld | Description
——|—————————————————
<< android_metadata >>
1 | CREATE TABLE android_metadata (locale TEXT);

<< Message >>
1 | CREATE TABLE Message (_id integer primary key autoincrement
2 | syncServerId text
3 | syncServerTimeStamp integer
4 | displayName text
5 | timeStamp integer
6 | subject text
7 | flagRead integer
8 | flagLoaded integer
9 | flagFavorite integer
10 | flagAttachment integer
11 | flags integer
12 | clientId integer
13 | messageId text
14 | mailboxKey integer
15 | accountKey integer
16 | fromList text
17 | toList text
18 | ccList text
19 | bccList text
20 | replyToList text
21 | meetingInfo text
22 | snippet text
23 | protocolSearchInfo text
24 | threadTopic text);

<< Message_Updates >>
1 | CREATE TABLE Message_Updates (_id integer unique
2 |  syncServerId text
3 |  syncServerTimeStamp integer
4 |  displayName text
5 |  timeStamp integer
6 |  subject text
7 |  flagRead integer
8 |  flagLoaded integer
9 |  flagFavorite integer
10 |  flagAttachment integer
11 |  flags integer
12 |  clientId integer
13 |  messageId text
14 |  mailboxKey integer
15 |  accountKey integer
16 |  fromList text
17 |  toList text
18 |  ccList text
19 |  bccList text
20 |  replyToList text
21 |  meetingInfo text
22 |  snippet text
23 |  protocolSearchInfo text
24 |  threadTopic text);

<< Message_Deletes >>
1 | CREATE TABLE Message_Deletes (_id integer unique
2 |  syncServerId text
3 |  syncServerTimeStamp integer
4 |  displayName text
5 |  timeStamp integer
6 |  subject text
7 |  flagRead integer
8 |  flagLoaded integer
9 |  flagFavorite integer
10 |  flagAttachment integer
11 |  flags integer
12 |  clientId integer
13 |  messageId text
14 |  mailboxKey integer
15 |  accountKey integer
16 |  fromList text
17 |  toList text
18 |  ccList text
19 |  bccList text
20 |  replyToList text
21 |  meetingInfo text
22 |  snippet text
23 |  protocolSearchInfo text
24 |  threadTopic text);

<< Attachment >>
1 | CREATE TABLE Attachment (_id integer primary key autoincrement
2 |  fileName text
3 |  mimeType text
4 |  size integer
5 |  contentId text
6 |  contentUri text
7 |  messageKey integer
8 |  location text
9 |  encoding text
10 |  content text
11 |  flags integer
12 |  content_bytes blob
13 |  accountKey integer
14 |  uiState integer
15 |  uiDestination integer
16 |  uiDownloadedSize integer);

<< Mailbox >>
1 | CREATE TABLE Mailbox (_id integer primary key autoincrement
2 |  displayName text
3 |  serverId text
4 |  parentServerId text
5 |  parentKey integer
6 |  accountKey integer
7 |  type integer
8 |  delimiter integer
9 |  syncKey text
10 |  syncLookback integer
11 |  syncInterval integer
12 |  syncTime integer
13 |  unreadCount integer
14 |  flagVisible integer
15 |  flags integer
16 |  visibleLimit integer
17 |  syncStatus text
18 |  messageCount integer not null default 0
19 |  lastTouchedTime integer default 0
20 |  uiSyncStatus integer default 0
21 |  uiLastSyncResult integer default 0
22 |  lastNotifiedMessageKey integer not null default 0
23 |  lastNotifiedMessageCount integer not null default 0
24 |  totalCount integer
25 |  lastSeenMessageKey integer);

<< HostAuth >>
1 | CREATE TABLE HostAuth (_id integer primary key autoincrement
2 |  protocol text
3 |  address text
4 |  port integer
5 |  flags integer
6 |  login text
7 |  password text
8 |  domain text
9 |  accountKey integer
10 | certAlias text);

<< Account >>
1 | CREATE TABLE Account (_id integer primary key autoincrement
2 |  displayName text
3 |  emailAddress text
4 |  syncKey text
5 |  syncLookback integer
6 |  syncInterval text
7 |  hostAuthKeyRecv integer
8 |  hostAuthKeySend integer
9 |  flags integer
10 |  isDefault integer
11 |  compatibilityUuid text
12 |  senderName text
13 |  ringtoneUri text
14 |  protocolVersion text
15 |  newMessageCount integer
16 |  securityFlags integer
17 |  securitySyncKey text
18 |  signature text
19 |  policyKey integer
20 |  notifiedMessageId integer
21 |  notifiedMessageCount integer);

<< Policy >>
1 | CREATE TABLE Policy (_id integer primary key autoincrement
2 |  passwordMode integer
3 |  passwordMinLength integer
4 |  passwordExpirationDays integer
5 |  passwordHistory integer
6 |  passwordComplexChars integer
7 |  passwordMaxFails integer
8 |  maxScreenLockTime integer
9 |  requireRemoteWipe integer
10 |  requireEncryption integer
11 |  requireEncryptionExternal integer
12 |  requireManualSyncRoaming integer
13 |  dontAllowCamera integer
14 |  dontAllowAttachments integer
15 |  dontAllowHtml integer
16 |  maxAttachmentSize integer
17 |  maxTextTruncationSize integer
18 |  maxHTMLTruncationSize integer
19 |  maxEmailLookback integer
20 |  maxCalendarLookback integer
21 |  passwordRecoveryEnabled integer
22 |  protocolPoliciesEnforced text
23 |  protocolPoliciesUnsupported text);

<< QuickResponse >>
1 | CREATE TABLE QuickResponse (_id integer primary key autoincrement
2 |  quickResponse text
3 |  accountKey integer);

I’m working on a post where I use Sqlite3 to ‘erase’ messages from the ‘trash’ folder.  It’s is not as simple as it should be, but i tmay be less hassle than deleting/creating your POP email over and over and over and over…

Topics: Computer Technology, Unix-Linux-Os | Comments Off on Android Nexus 4 – Still can’t ’empty trash’…

Linux Contact/Schedule/More Device Sync with Funambol

By Dale Reagan | January 15, 2013

This post is part of my review of the Nexus 4 (2012) phone from Google…

I may provide more details about using Funambol in a future post…

Funambol – a ‘private cloud’ (or cloud) based solution; in 12/2012 this does not appear to work out-of-the-box guessing due to Tomcat-network-related-features (i.e. seems that the download includes ‘remnant files’, which, if you remove may allow for the solution to work (below, steps from the referenced post, actual commands will vary based on install/OSt.)

* stop funambol server
* delete dir /opt/Funambol/tools/tomcat/webapps/funambol
* delete dir /opt/Funambol/tools/tomcat/work
* delete dir /opt/Funambol/tools/tomcat/temp
* start funambol server

Working with Funambol –  Contacts/Schedule Sync Server

I gave up on my first pass with this since the Demo did not work; I returned after not finding another solution (and finding the link above about removing certain files..)  While in general, the ‘community edition’ docs are ok, they are not-in-sync with what you see and what I had to do to get a working-sync solution.  I installed and configured:

I can update my calendar or contacts on any device below and then sync diffs from:

The Android Nexus 4 (newest phone from Google in 2012) does NOT seem to natively support Outlook Notes or Tasks.  Also, you may want to change the default port from ‘8080’ to something less common as well as creating firewall rules to limit access.  While you could implement this solution ‘in the cloud’ or ‘on the net’ I suggest a simple, internal network setup (at least, to start.)

Funambol File downloads – 2011/06 version appears to be latest (12/2012)

Full solution

Pull the Source via SVN:

Extending to the Enterprise
 As always, your mileage may vary.  :)

 

Topics: Computer Technology, Mobile Web, System and Network Security, Unix-Linux-Os, Web Technologies | Comments Off on Linux Contact/Schedule/More Device Sync with Funambol

More Awk with CSV/DSV files

By Dale Reagan | January 13, 2013

CSV/DSV Basics

Why Use Awk?

This post will cover basics for using CSV/DSV data with Awk (under Linux). If your need is immediate then one solution might be a toolbox tool like CSVFix – from Google Code – it has these features:

Awk – Basics for working with DSV/CSV Data

While there are differences between Awk implementations across various operating systems, for the most part, the approaches discussed here should be portable (work for any implementation of awk.)  Consult your OS  man pages for local quirks (i.e. man awk.)  Some caveats for this post:

awk -F “,” ‘{print}’  some_file.csv   ## this will simply print whatever is in your input file

awk -F “,” ‘{print $2}’  some_file.csv   ## only print the 2nd field

awk -F “,” ‘{printf “%s | %s\n”, NR, $2}’  some_file.csv   ## only print the line number and the 2nd field

awk -F “,” ‘{printf “%s | %s\n”, NR, $0}’  some_file.csv   ## only print the line number and the entire record (all fields)

Some Dummy CSV Data

The data below contains a Header line, 6 ‘good’ data entries and one ‘bad’ entry.  The last entry is ‘bad’ because the data contains too many commas.  Finding these types of ‘problem lines’ in thousands of lines of data can be tricky –  using just a little Awk code we can identify the ‘bad lines’ and either exclude them from output or edit the data and re-process.

Title,First Name,Middle Name,Last Name
ContactOne,Entry1,Middle1,Last_1
Contacttwo,Entry2,Middle2,Last_2
ContactThree,Entry3,Middle3,Last_3
ContactFour,Entry4,Middle4,Last_4
ContactFive,Entry5,Middle5,Last_5
ContactSix,Entry6,Middle6,Last_6
Dr.,SomeDoc, B. Bubba SomeDoc, D.M.D, FAGD,123 Some Street


The remainder of this post will explore functions from a Bash script that execute small Awk programs.


### this function prints:
# the Default Field count (based on the header line)
# the field count per record
# a line for EACH field
# an Error message if data contains too many/few fields
function simple_csv_1 {
stars Begin_${NUM_LINES}_lines
head -${NUM_LINES} ${IN_FILE} | \
awk -F "," '{
        if(NR == 1)
                DEF_NF=NF
        printf "NF/Def_NF %d/%d\n", NF,DEF_NF

        for (FLD=1 ; FLD <= DEF_NF ; FLD++)
                printf "\tRec: %3d | Fld: %3d | %s\n", NR, FLD, $FLD

        if(NF != DEF_NF) printf "Too many fields for record: %d\n\t%s\n", NR, $0
        }'
}

Here is the output (using the sample data above):
        ### Begin_10_lines ###
NF/Def_NF 4/4
        Rec:   1 | Fld:   1 | Title
        Rec:   1 | Fld:   2 | First Name
        Rec:   1 | Fld:   3 | Middle Name
        Rec:   1 | Fld:   4 | Last Name
NF/Def_NF 4/4
        Rec:   2 | Fld:   1 | ContactOne
        Rec:   2 | Fld:   2 | Entry1
        Rec:   2 | Fld:   3 | Middle1
        Rec:   2 | Fld:   4 | Last_1 
NF/Def_NF 4/4
        Rec:   3 | Fld:   1 | Contacttwo
        Rec:   3 | Fld:   2 | Entry2
        Rec:   3 | Fld:   3 | Middle2
        Rec:   3 | Fld:   4 | Last_2 
NF/Def_NF 4/4
        Rec:   4 | Fld:   1 | ContactThree
        Rec:   4 | Fld:   2 | Entry3
        Rec:   4 | Fld:   3 | Middle3
        Rec:   4 | Fld:   4 | Last_3 
NF/Def_NF 4/4
        Rec:   5 | Fld:   1 | ContactFour
        Rec:   5 | Fld:   2 | Entry4
        Rec:   5 | Fld:   3 | Middle4
        Rec:   5 | Fld:   4 | Last_4 
NF/Def_NF 4/4
        Rec:   6 | Fld:   1 | ContactFive
        Rec:   6 | Fld:   2 | Entry5
        Rec:   6 | Fld:   3 | Middle5
        Rec:   6 | Fld:   4 | Last_5 
NF/Def_NF 4/4
        Rec:   7 | Fld:   1 | ContactSix
        Rec:   7 | Fld:   2 | Entry6
        Rec:   7 | Fld:   3 | Middle6
        Rec:   7 | Fld:   4 | Last_6 
NF/Def_NF 6/4
        Rec:   8 | Fld:   1 | Dr.
        Rec:   8 | Fld:   2 | SomeDoc
        Rec:   8 | Fld:   3 |  B. Bubba SomeDoc
        Rec:   8 | Fld:   4 |  D.M.D
Too many fields for record: 8
        Dr.,SomeDoc, B. Bubba SomeDoc, D.M.D, FAGD,123 Some Street

The next function only prints records with ‘errors’.

### this function looks for 'errors' in the CSV data too many/few fields based on FS
function simple_csv_2 {
stars Begin_${NUM_LINES}_lines_Print_Err_Only
head -${NUM_LINES} ${IN_FILE} | \
awk -F "${D_FS}" -v D_FS="${D_FS}" '{
        if(NR == 1)
                DEF_NF=NF

        if(NF != DEF_NF) {
                ERR_CNT+=1
                printf "Should have: %d but found: %d Fields using delimiter: (%s).\n", DEF_NF, NF, FS
                printf "Too many fields for record: %d\n\t%s\n", NR, $0
                for (FLD=1 ; FLD <= DEF_NF ; FLD++)
                        printf "Rec: %3d | Fld: %3d | %s\n", NR, FLD, $FLD }
        }
        END {
                printf "\n\tError count: %d\n", ERR_CNT
        }'
}

The output:

        ### Begin_10_lines_Print_Err_Only ###
Should have: 4 but found: 6 Fields using delimiter: (,).
Too many fields for record: 8
        Dr.,SomeDoc, B. Bubba SomeDoc, D.M.D, FAGD,123 Some Street
Rec:   8 | Fld:   1 | Dr.
Rec:   8 | Fld:   2 | SomeDoc
Rec:   8 | Fld:   3 |  B. Bubba SomeDoc
Rec:   8 | Fld:   4 |  D.M.D

        Error count: 1

Next up, a function to show some ‘stats’ for the CSV data.

### this function uses an array and
### shows 'stats' for the CSV data, counts by fields and CSV Error count
function simple_array_1 {
stars Begin_${NUM_LINES}_lines_Simple_Array
head -${NUM_LINES} ${IN_FILE} | \
awk -F "${D_FS}" -v D_FS="${D_FS}" \
    '{
        if(NR == 1) {
                DEF_NF=NF
                for (FLD=1 ; FLD <= DEF_NF ; FLD++)
                        csv_stats[FLD, 0] = $FLD
        }
        for (FLD=1 ; FLD <= DEF_NF ; FLD++) {
                if($FLD != "")
                        csv_stats[FLD, 1] += 1
        }
        if(NF != DEF_NF) {
                NUM_ERR++
        }
    }
        END {
        printf "\n\n"
        printf " Def # Flds: %5d (based on header from 1st line of CSV file)\n", DEF_NF
        printf " Fld Errors: %5d (too many/few Fields)\n", NUM_ERR
        printf " Record Cnt: %5d\n", NR
        printf "\n"
        printf "  Fld | Description          | Non-null\n"
        printf "------|----------------------|-------------\n"
        for (FLD=1 ; FLD <= DEF_NF ; FLD++)
                        printf "%5d | %-20s | %5d\n", FLD, csv_stats[FLD, 0], csv_stats[FLD, 1]
        printf "\n"
        }'
}

The Output:

        ### Begin_10_lines_Simple_Array ###

 Def # Flds:     4 (based on header from 1st line of CSV file)
 Fld Errors:     1 (too many/few Fields)
 Record Cnt:     8

  Fld | Description          | Non-null
------|----------------------|-------------
    1 | Title                |     8
    2 | First Name           |     8
    3 | Middle Name          |     8
    4 | Last Name            |     8

The final function simply adds a new array element and store the current line of data; then the script ends this element contains the last line of data.

### this function shows 'stats' for the CSV data,
### counts by fields and CSV Error count AND
### includes the last data set
function simple_array_2 {
stars Begin_${NUM_LINES}_lines_Array_2
cat ${IN_FILE} | \
awk -F "${D_FS}" -v D_OFS="${D_FS}" \
    '{
        if(NR == 1) {
                DEF_NF=NF
                for (FLD=1 ; FLD <= DEF_NF ; FLD++)
                        csv_stats[FLD, 0] = $FLD
        }
        for (FLD=1 ; FLD <= DEF_NF ; FLD++) {
                if($FLD != "")
                        csv_stats[FLD, 1] += 1
                        csv_stats[FLD, 2] = $FLD
        }
        if(NF != DEF_NF) {
                NUM_ERR++
        }
    }
        END {
        printf "\n\n"
        printf " Def # Flds: %5d (based on header from 1st line of CSV file)\n", DEF_NF
        printf " Fld Errors: %5d (too many/few Fields)\n", NUM_ERR
        printf " Record Cnt: %5d\n", NR
        printf "\n"
        printf "  Fld | Description          | Non-null | Last Data\n"
        printf "------|----------------------|----------|-----------------\n"
        for (FLD=1 ; FLD <= DEF_NF ; FLD++)
                        printf "%5d | %-20s | %8d | %s\n", FLD, csv_stats[FLD, 0], csv_stats[FLD, 1], csv_stats[FLD, 2]
        printf "\n"
        }'
}

Bash Script: simple.csv.awk

I combined all of the above into a Bash script called simple.csv.awk – it includes a small function to ‘print stars’ and uses both Cat and Head commands to ‘send’ (pipe) data to Awk.

Some command line Awk to try (convert delimiters on the fly…)

cat small.csv |  awk 'BEGIN { FS=","; OFS="\t"} {print $1,$2,$3}'
cat small.csv |  awk 'BEGIN { FS=","; OFS=":"} {print $1,$2,$3}'
cat small.csv |  awk 'BEGIN { FS=","; OFS="|"} {print $1,$2,$3}'
cat small.csv |  awk 'BEGIN { FS=","; OFS="|||"} {print $1,$2,$3}'

Topics: Computer Technology, Problem Solving, Unix-Linux-Os | Comments Off on More Awk with CSV/DSV files


« Previous Entries Next Entries »

________________________________________________
YOUR GeoIP Data | Ip: 73.21.121.1
Continent: NA | Country Code: US | Country Name: United States
Region: | State/Region Name: | City:
(US only) Area Code: 0 | Postal code/Zip:
Latitude: 38.000000 | Longitude: -97.000000
Note - if using a mobile device your physical location may NOT be accurate...
________________________________________________

Georgia-USA.Com - Web Hosting for Business
____________________________________