Category Archives: Tech Hacks

Blog posts in this category share tech hacks that are useful in daily technology operations and help quickly solve, or propose a solution for, a technology problem using code or no-code paradigms.

Where Is My Last Name in CRM?

I have recently been talking about my tech hacks for solving day-to-day problems in technology using different programming approaches. This blog post is about cleaning up the name field on a customer record.

I was given this challenge in the context of the Dutch language and was asked whether there are remedies beyond the usual grep or split commands for deriving the first name and the last name from a field that currently holds only the full name, with the last name field left blank!

It was an interesting problem to look at, as you could leverage many approaches, from deploying Mechanical Turk to specialized cleansing services, but I chose to go a different way.

I used the following Python packages to write a small piece of code to get my results:

  1. Probable People – An open source package maintained by Datamade.
  2. SpaCy – An industrial-strength Natural Language Processing (NLP) package

What is Probable People?

probablepeople is a python library for parsing unstructured romanized name or company strings into components, using advanced NLP methods. This is based off usaddress, a python library for parsing addresses.

For those who aren’t Python developers, the project also offers a web interface to try it out.

What this can do: Using a probabilistic model, it makes (very educated) guesses in identifying name or corporation components, even in tricky cases where rule-based parsers typically break down.

What this cannot do: It cannot identify components with perfect accuracy, nor can it verify that a given name/company is correct/valid.

probablepeople learns how to parse names/companies through a body of training data. If you have examples of names/companies that stump this parser, please send them over! By adding more examples to the training data, probablepeople can continue to learn and improve.

What is SpaCy?

SpaCy is an industrial-strength NLP library written in Python, and more can be found on its site. Given its popularity, it is probably not worth me writing more about it here.

Whilst both packages provide powerful machine learning approaches to train, re-train and evaluate your machine learning model in the context of the problem, I have taken an OOTB (Out Of The Box) approach and directly ingested the data using the available corpus and probabilistic parser.

Approach

In terms of approach, I have used a pipeline architecture where the same data is sent to both libraries and the results are then reconciled for presentation in the output. In simple terms, I have combined the CRF (Conditional Random Field) approach of ProbablePeople with Named Entity Recognition (NER) from SpaCy to construct a pipeline that achieves my objective.
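The fan-out-and-reconcile shape described above could be sketched roughly as follows. This is only an illustration, not the code used for the post: `run_pipeline`, `fake_crf` and `fake_ner` are hypothetical names, and the two stand-in parsers are deliberately naive so that the sketch runs without either library installed. In the real flow they would wrap `pp.tag` and the SpaCy `nlp` object.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical type aliases for the two parsers in the pipeline.
CrfParser = Callable[[str], Tuple[Dict[str, str], str]]
NerParser = Callable[[str], List[Dict[str, str]]]


def run_pipeline(full_name: str, crf: CrfParser, ner: NerParser) -> Dict[str, str]:
    """Send the same string to both parsers and merge the results into one row."""
    fields, crf_type = crf(full_name)
    entities = ner(full_name)

    row = {"crf_type": crf_type, "ner_entity": "", "ner_type": ""}
    if entities:
        # Keep only the first entity found; an assumption made for brevity.
        label, text = next(iter(entities[0].items()))
        row["ner_type"], row["ner_entity"] = label, text
    row.update(fields)
    return row


# Naive stand-in parsers (assumptions, not real library output).
def fake_crf(s: str) -> Tuple[Dict[str, str], str]:
    given, _, surname = s.partition(" ")
    return {"GivenName": given, "Surname": surname}, "Person"


def fake_ner(s: str) -> List[Dict[str, str]]:
    return [{"PER": s}]


row = run_pipeline("Jan Jansen", fake_crf, fake_ner)
print(row)
```

Passing the parsers in as arguments keeps the reconciliation step independent of either library, which makes it easy to swap one out or test the merge logic on its own.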

Simple Workflow For Creating a Structured output for name parser

The following basic code snippets should help you understand the simple workings within the code and assemble your own output.

#Installation Commands 
pip install probablepeople
pip install spacy
pip install xlrd
pip install pandas
...
#Using dutch corpus for spacy
python -m spacy download nl_core_news_sm
...
#import
import probablepeople as pp
import pandas as pd
import xlrd
import csv
import os.path
import spacy
from spacy.matcher import Matcher
import nl_core_news_sm
...
#load corpus
nlp = nl_core_news_sm.load()
...
#Clean-up functions
def _removeNumbers(s):
    # Drop every numeric digit from the string
    res = "".join(filter(lambda x: not x.isdigit(), s))

    return res

def _removePunctuation(s):
    # Punctuation marks to strip
    punctuations = r'''!()-[]{};:'"\,<>./?@#$%^&*_~'''

    # Replace every punctuation mark found in the string with an empty string
    for x in s:
        if x in punctuations:
            s = s.replace(x, "")

    # Return the string without punctuation
    return s

def _removeNonAscii(s):
    # Keep only 7-bit ASCII characters (this also drops accented letters)
    return "".join(i for i in s if ord(i) < 128)

...
#NER Functions
def _nerExtraction(s):
    doc = nlp(s)
    entity_collection = []
    for ent in doc.ents:
        entity = {}
        entity[ent.label_] = ent.text
        entity_collection.append(entity)

    return entity_collection

#Parser Function Call
try:
    # pp.tag returns a (tagged fields, type) tuple
    ordered_text = pp.tag(value)
except pp.RepeatedLabelError as e:
    .....

From a single input field we got one or many fields in a structured manner, written to a CSV file with the columns below. During the exercise it was also interesting to see that not every name was a person; some turned out to be company names!

    'ner_entity',
    'ner_type',
    'crf_type',
    'PrefixMarital',    
    'PrefixOther',
    'GivenName',
    'FirstInitial',
    'MiddleName',
    'MiddleInitial',
    'Surname',
    'LastInitial',
    'SuffixGenerational',
    'SuffixOther',
    'Nickname',
    'SecondGivenName',
    'SecondSurname',
    'And',
    'CorporationName',
    'CorporationNameOrganization',
    'CorporationLegalType',
    'CorporationNamePossessiveOf',
    'ShortForm',
    'ProxyFor',
    'AKA'
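One way to write rows with the columns listed above is `csv.DictWriter` from the standard library, sketched below. The column list is shortened for brevity, and `rows_to_csv` and the sample rows are hypothetical illustrations rather than output from the actual run.

```python
import csv
import io

# Column order taken from the field list above (shortened here for brevity).
COLUMNS = ["ner_entity", "ner_type", "crf_type", "GivenName", "Surname", "CorporationName"]


def rows_to_csv(rows):
    """Write parsed rows to CSV text; columns a row lacks are left blank."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, restval="", extrasaction="ignore")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical parsed rows: one person and one company.
sample_rows = [
    {"ner_type": "PER", "crf_type": "Person", "GivenName": "Jan", "Surname": "Jansen"},
    {"ner_type": "ORG", "crf_type": "Corporation", "CorporationName": "Jansen"},
]
csv_text = rows_to_csv(sample_rows)
print(csv_text)
```

Using `restval=""` means a person row and a company row can share one file even though each fills a different subset of the columns.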

Using the above flow, I was able to clean up the data and provide a simple automation for a CRM flow, which can then be converted to an API and provide value using an open-source approach.

If you have any feedback or comments, do let me know via the post comments!

Light-Code Data Integration With Zapier

Recently I started sharing my learning from various experiments in the area of no-code and light-code. Previously I had written a blog post on no-code Airtable integration for data collection and processing. This post is about an experiment that I did a few weeks back for a Proof-Of-Concept to create tickets and search for users in Zendesk (which, for many, needs no introduction).

In order to complete my Proof-Of-Concept, I divided my processing into four major blocks:

  • Data Entry
    • Leverages a simple app created using React & React Zapier Form
    • Deploys to surge.sh, a very easy-to-use static web publishing platform
  • Data Collection & Mapping
    • Created a workflow step to collect & map data using Zapier
  • Triggers
  • Data Persistence
    • Created a workflow step to persist the processed information back into storage of choice
    • Alternatively, the data can be inspected using RequestBin

The overall architecture flow looks somewhat like this:

[Figure: architecture flow]

In terms of account set-up, you would need a trial or entry-level account with each of the following:

  • Zapier
  • Zendesk
  • Surge.sh
  • RequestBin

In this experiment the dominant design pattern revolves around Zapier. As we walk through the various blocks, you will see how the different constructs of a Zap, as Zapier calls it, come into play.

Data Entry

Starting from a default React app, I integrated the react-zapier-form package (details are provided above). This package helped me quickly integrate with a catch-hook defined within the Zapier workflow, which allows us to post the data from the React form to the catch-hook as a JSON payload.

import ZapierForm from 'react-zapier-form'
 
...
 
<ZapierForm action='INSERT YOUR HOOK'>
   {({ error, loading, success }) => {
      return (
         <div>
            {!success && !loading &&
               <div>
                  <input type='email' name='Email' placeholder='Email' />
                  <textarea name='Message' placeholder='Your message' />
                  <button>Submit</button>
               </div>
            }
            {loading && <div>Loading...</div>}
            {error && <div>Something went wrong. Please try again later.</div>}
            {success && <div>Thank you for contacting us!</div>}
         </div>
      )
   }}
</ZapierForm>

Once the React app is ready for deployment, I always love to move from a localhost Proof-Of-Concept to a deployment in the cloud, so surge.sh came in very handy. Surge has been built from the ground up for native web application publishing and is committed to being the best way for Front-End Developers to put HTML5 applications into production.

And you can deploy for free to start with 🙂

npm install -g surge
npm run build
cd build
mv index.html 200.html
surge

The command sequence does the following:

  • Install surge
  • Build your React App
  • Rename index.html to 200.html. [If we don’t rename index.html, everything will work fine, but if you have client-side routing (maybe with React Router) and we navigate to a new route and refresh the page, we’ll get a 404 “page not found” error. Since many React projects implement client-side routing, I have included this step. If you aren’t using client-side routing, feel free to skip renaming the index.html file. Read more about adding a 200 page for client-side routing on the Surge help docs.]
  • Now run the surge command, and that’s it!

Data Mapping & Triggers

Constructing a Zapier workflow is pretty straightforward, and one can proceed very swiftly through the integration. The workflow includes a Node.js-based code step that captures the response and then posts it to a URL, which I grabbed from RequestBin.

Once the whole process runs end to end, you can see that a post of the processed information is available at the HTTP hook. One can similarly send this data to persistent storage using Zapier, as it integrates with many popular persistence mechanisms, including queues.

One of the things you would see in the workflow schematic image, and in the workflow itself, is the use of a request_id that is generated on the client side and then carried across the processing pipeline. This lets us trace a request all along the Zapier workflow and later look up the result using the same request_id. I used the uuid package to generate this GUID.
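The request_id idea can be sketched as below. Note the assumptions: the app itself generated the id in JavaScript with the uuid package, whereas this sketch uses Python's standard `uuid` module to show the same pattern, and `build_payload` and the `results` dictionary are hypothetical stand-ins for the form submission and the downstream storage.

```python
import json
import uuid


def build_payload(email: str, message: str) -> dict:
    """Attach a freshly generated request_id to the form data before posting."""
    return {
        "request_id": str(uuid.uuid4()),  # random GUID, generated on the client
        "Email": email,
        "Message": message,
    }


# Hypothetical store of processed results, keyed by request_id.
results = {}

payload = build_payload("someone@example.com", "Hello")
body = json.dumps(payload)  # what would be posted to the catch-hook

# Later, a downstream step records its outcome under the same id...
results[payload["request_id"]] = {"status": "processed"}

# ...so the caller can look the result up again.
print(results[json.loads(body)["request_id"]])
```

Because the id is created before the request leaves the client, every step in the workflow can log or store against it without any coordination between steps.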

I hope people find this useful for their day-to-day workflow automation problems, and that it gives them some more options for moving steadily through the integration problems of connecting different apps. Zapier provides more than 1,500 integrations that can be used to automate many tasks.

If you have any feedback or comments, post them on the blog. Happy reading!

No-Code Airtable Integration

I have been using Airtable for quite some time now at RecipeDabba, where I work as a part-time co-founder and coder! My wife, Rakshita Dwivedi, is the actual consumer of my work.

Almost every feature offered by Airtable helps to power the lightweight tech support for my wife’s 21-day challenges, which run in multiple formats and help promote her healthy-eating philosophy for kids. This became even more significant during the pandemic as she shifted the bulk of her work online.

The diagram below shows the workflow-based architecture, which is powered by Airtable:

Schematic flow – copyright – Recipedabba

Airtable is a versatile cloud-based sheet / database solution that helps automate a large part of a lightweight process through

  • Multiple data types
  • Formulae
  • Blocks
  • Forms

I use all of the above in combination to do multiple pieces in the workflow like

  • Basics
    • Table Creations
    • Views
  • Data Grouping
    • Use of filters , group by
  • Analytics & Derivations
    • Roll-up fields [ a very powerful feature ]
    • Formulae to derive new fields [ this was another awesome feature ]
  • Data Entry
    • Forms
  • Blocks
    • De-Dupe Checks
    • Charts

Chart Presentation of Data

You can see above how the table data is quickly transformed into a basic chart visualization.

De-dupe block to remove duplicate entries

An awesome block for removing duplicate entries from the system with a few clicks and configurations.

Snippets from the form view

Form rendition on mobile and desktop is very nice. Since we started to use this, the mothers [ who are the primary collectors of information on behalf of the kids who participate ] have found it easy to fill in the information and send it back to us!

Formulae and Applications

We can work on top of the data and apply many conditionalities, which allows flexible viewing of the data in real time. Some of these things can take coding effort when connecting with analytics, but first-level aggregation and analytics on a daily basis have been very easy to perform in Airtable.

Overall, for an upcoming or very small set-up, Airtable is a good fit. If you want to know more about how to do things in Airtable, feel free to ping me via comments and I will see if I can help!

Deferred Objects & Arrays

Just documented some stuff around deferred objects and array…

Was coding just for fun and ran into the issue of not getting back the objects passed to each deferred’s resolve() method, because jQuery calls the done() and fail() callbacks with individual parameters, not an array. That means we have to use the arguments pseudo-array to get all the resolved/rejected objects returned by the array of deferreds.

$.when.apply($, deferreds).then(function() {
    var objects = arguments; // The array of resolved objects as a pseudo-array
    ...
});

Here is a solution inspired by when.js’s when.all() method that addresses these problems:

// Put somewhere in your scripting environment
if (jQuery.when.all === undefined) {
    jQuery.when.all = function(deferreds) {
        var deferred = new jQuery.Deferred();
        $.when.apply(jQuery, deferreds).then(
            function() {
                // Resolve with a real array instead of separate arguments
                deferred.resolve(Array.prototype.slice.call(arguments));
            },
            function() {
                // reject() settles the deferred; fail() would only attach a callback
                deferred.reject(Array.prototype.slice.call(arguments));
            });

        return deferred;
    }
}

Now you can simply pass in an array of deferreds/promises and get back an array of resolved/rejected objects in your callback, like so:

$.when.all(deferreds).then(function(objects) {
    console.log("Resolved objects:", objects);
});

Reference: Pass in an array of Deferreds to $.when()