General, Technical

Privacy: A How-To

Introduction

With the leak of classified NSA documents and the revelations they entailed, Edward Snowden has become a household name. He single-handedly caused millions of people to rethink their electronic lives – and their assumptions of privacy. Now, those people (and businesses) are scrambling to find solutions to a problem that, until a few months ago, they either didn’t know existed or chose to remain blissfully unaware of.

There have been numerous blog posts and documents about enhancing your systems to increase privacy protection, and I thought that I would summarize many of them from the perspective of someone who works in the industry. The sections of this article are organized in order of complexity (and tinfoil hattiness). The easiest and most basic measures will be in section 1 while the most complex and restrictive measures will be in the last.

Programming, Technical

Git: List all branches, who made them, and when they were created

I ran into an interesting problem today that took a bit longer to figure out than I expected, so I will post it here in the hope that others will find the solution faster than I did.

My problem: I am inspecting a rather large and complex piece of software and need to look at its history – more specifically, I need to figure out when certain branches were created and who created them.

Now, I could have just hacked together a throw-away solution, but that’s not who I am. As I was looking, I found that there didn’t seem to be much on this – aside from the Stack Overflow questions.

So I whipped up a little script that gets all of the branches in the git repo and, for each one, shows the commit where it forked from master (its merge base), as a stand-in for the branch's first commit. These details include the date and the author, which is what I needed. (Bash script)

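#!/usr/bin/env bash
#
# Author: Caleb Shortt
#
# Lists every local and remote branch along with the commit where it
# forked from master (its merge base), including the author and date.
#
# Sources:
# Stack Overflow: Question: "What is the easiest/fastest way to find out when
# a git branch was created"
#

# for-each-ref prints clean branch names. 'git branch -a' marks the current
# branch with a '*', which an unquoted loop expands to the directory contents.
readarray -t branch_list < <(git for-each-ref --format='%(refname:short)' refs/heads refs/remotes)

for branch in "${branch_list[@]}"
do
    echo '---------------------------------------------'
    echo "Branch: $branch"
    echo '---------------------------------------------'
    git show --summary "$(git merge-base master "$branch")"
done

The first version of this script had a little hiccup: it also printed the details of whatever was in the current directory. The cause is that git branch -a marks the current branch with a '*', and the unquoted loop let the shell expand that '*' into file names. Listing the branches with git for-each-ref and quoting the variables avoids the problem.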
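For anyone who would rather drive git from Python, here is a minimal sketch of the same lookup. The names run_git and fork_points are placeholders of my own, and the sketch assumes Python 3.7+, git on the PATH, and a base branch called master. Treat it as an illustration rather than a drop-in tool.

```python
import subprocess


def run_git(*args):
    """Run a git command in the current directory and return its stdout."""
    result = subprocess.run(
        ["git", *args], capture_output=True, text=True, check=True
    )
    return result.stdout.strip()


def fork_points(base="master", run=run_git):
    """Map each branch name to 'hash|author|date' of its merge base with base.

    The 'run' parameter is a hypothetical hook so that the git calls can be
    swapped out (for example in a test); by default it shells out to git.
    """
    names = run(
        "for-each-ref", "--format=%(refname:short)", "refs/heads", "refs/remotes"
    )
    details = {}
    for branch in names.splitlines():
        # The merge base is the commit where the branch forked from base.
        commit = run("merge-base", base, branch)
        details[branch] = run("show", "-s", "--format=%h|%an|%ad", commit)
    return details
```

Calling fork_points() inside a repository returns a dictionary that can be printed or sorted by date. As with the Bash script, the merge base only approximates when a branch was created.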

Programming, Technical

Auto-generate HTML using a tree in Python

Here’s something I’ve been working on and thought was interesting.

I needed to dynamically generate HTML with varying degrees of nesting and attributes in Python. All I found were a few Stack Overflow questions on generating HTML – and some blogs talking about hard-coding the tags.

This is far from what I needed, so I started to look into making my own – besides, it looked fun!

I started by looking at what I actually wanted:

  • Needs to be easily extensible
  • Needs to be structured in a way that makes sense

With these requirements in mind, I elected to go with the tree approach. HTML is a hierarchy: the <html> tag is the root node, the <head> and <body> nodes are children of <html>, and so on. Obviously it would be an n-ary tree, as a node can have lots of children – or none.

Also, I would have to set up a specific tree traversal that executes commands before and after visiting a node's children – namely, printing out the opening and closing tags.

This is what I came up with for the tree structure:


class DefTreeNode(object):
    '''
        A Protocol Definition Tree Node.
        It will contain a single payload (ex: a tag in HTML),
        a list of child nodes, a label, and what to do prefix
        and postfix during traversal.

        The payload is of type 'Generic_HTML_Tag'

        The label is a unique string that identifies the node.
        Both the label and the payload are required to initialize
        a node.
    '''

    def __init__(self, label, payload, contents=""):
        self.children = []
        self.payload = payload
        self.label = label
        self.contents = contents

    def addChild(self, child):
        if child:
            self.children.append(child)
            return True
        return False

    def setPayload(self, payload):
        if payload:
            self.payload = payload
            return True
        return False

    def setLabel(self, label):
        self.label = label

    def setContents(self, contents):
        self.contents = contents

    def getChildren(self):
        return self.children

    def getPayload(self):
        return self.payload

    def getLabel(self):
        return self.label

    def getPrefix(self):
        if self.payload:
            return self.payload.getPrefix()
        else:
            return None

    def getPostfix(self):
        if self.payload:
            return self.payload.getPostfix()
        else:
            return None

    def getContents(self):
        return self.contents

Note that the payload is not a string but (as the class docstring says) is of type ‘Generic_HTML_Tag’. We will get to that in a second. Let’s finish the tree structure first.

Now that I have a node to work with, let’s make the tree. The tree class will contain the traversal code and a “find node” function, and it will hold the root node for the structure:


class DefinitionTree(object):
    def __init__(self, node):
        self.root = node

    def getRoot(self):
        return self.root

    def findNode(self, label):
        return self.recursive_findNode(label, self.root)

    def traverse(self):
        if self.root:
            return self.recursive_traverse(self.root)
        else:
            return ""

    def recursive_traverse(self, node, construction=""):
        '''
            Traverse the tree.
            This algorithm will run a pre-order traversal. When
            the algorithm encounters a new node it immediately
            calls its 'prefix' function, it then appends the
            node's payload, and traverses the node's children.
            After visiting the node's children, its 'postfix'
            function is called and the traversal for this node
            is complete.

            Returns a complete construction of the nodes' prefix,
            content, and postfix in a pre-order traversal.
        '''

        if(node):
            construction = construction + node.getPrefix() + node.getContents()

            for child in node.getChildren():
                construction = self.recursive_traverse(child, construction)

            return construction + node.getPostfix()

    def recursive_findNode(self, label, node):
        '''
            Executes a search of the tree to find the node
            with the specified label. This algorithm finds the first
            label that matches the search and is case insensitive.

            Returns the node with the specified label or None.
        '''

        if(node is None or node.getLabel().lower() == label.lower()):
            return node

        for child in node.getChildren():
            found = self.recursive_findNode(label, child)
            if found is not None:
                return found

        # No node in this subtree carries the label
        return None


Now, let's create the payload for the nodes. This will be done by creating a file to store our 'html tag classes':

 


class Generic_HTML_Tag(object):
    '''
        A Generic HTML tag class
        This class should not be called directly, but contains 
        the information needed to create HTML tag subclasses
    '''

    def __init__(self):
        self.prefix = ""
        self.postfix = ""
        self.indent = 0
        self.TAB = " "

    def getPrefix(self):
        return self.prefix

    def getPostfix(self):
        return self.postfix

    def setPrefix(self, prefix):
        self.prefix = prefix

    def setPostFix(self, postfix):
        self.postfix = postfix


class HTML_tag(Generic_HTML_Tag):
    def __init__(self, indent_level=0):
        Generic_HTML_Tag.__init__(self)
        self.indent = indent_level
        self.generateHTMLPrefix()
        self.generateHTMLPostfix()

    def generateHTMLPrefix(self):
        self.prefix = self.indent*self.TAB + "<html>"

    def generateHTMLPostfix(self):
        self.postfix = self.indent*self.TAB + "</html>"

class HEAD_tag(Generic_HTML_Tag):
    def __init__(self, indent_level=0):
        Generic_HTML_Tag.__init__(self)
        self.indent = indent_level
        self.generateHTMLPrefix()
        self.generateHTMLPostfix()

    def generateHTMLPrefix(self):
        self.prefix = self.indent*self.TAB + "<head>"

    def generateHTMLPostfix(self):
        self.postfix = self.indent*self.TAB + "</head>"

class TITLE_tag(Generic_HTML_Tag):
    def __init__(self, indent_level=0):
        Generic_HTML_Tag.__init__(self)
        self.indent = indent_level
        self.generateHTMLPrefix()
        self.generateHTMLPostfix()

    def generateHTMLPrefix(self):
        self.prefix = self.indent*self.TAB + "<title>"

    def generateHTMLPostfix(self):
        self.postfix = self.indent*self.TAB + "</title>"

class BODY_tag(Generic_HTML_Tag):
    def __init__(self, indent_level=0):
        Generic_HTML_Tag.__init__(self)
        self.indent = indent_level
        self.generateHTMLPrefix()
        self.generateHTMLPostfix()

    def generateHTMLPrefix(self):
        self.prefix = self.indent*self.TAB + "<body>"

    def generateHTMLPostfix(self):
        self.postfix = self.indent*self.TAB + "</body>"

class P_tag(Generic_HTML_Tag):
    def __init__(self, indent_level=0):
        Generic_HTML_Tag.__init__(self)
        self.indent = indent_level
        self.generateHTMLPrefix()
        self.generateHTMLPostfix()

    def generateHTMLPrefix(self):
        self.prefix = self.indent*self.TAB + "<p>"

    def generateHTMLPostfix(self):
        self.postfix = self.indent*self.TAB + "</p>"

 

Great! Obviously this is quite basic and incomplete, but the idea is there. The generateHTMLPrefix() and generateHTMLPostfix() functions are the modifiers. You can add extra parameters and such here without breaking your code elsewhere. Also, you can add additional logic to, say, only add a specific attribute if the tag has it.

Ex: only a few tags have an onload attribute
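As a rough sketch of that idea, here is one way a tag class could emit onload only when a handler is supplied. The BODY_onload_tag class and its onload parameter are hypothetical additions of mine, not part of the classes above, and the base class is repeated in trimmed form only so the snippet runs on its own.

```python
class Generic_HTML_Tag(object):
    '''Trimmed stand-in for the base class above so this sketch is runnable.'''

    def __init__(self):
        self.prefix = ""
        self.postfix = ""
        self.indent = 0
        self.TAB = " "

    def getPrefix(self):
        return self.prefix

    def getPostfix(self):
        return self.postfix


class BODY_onload_tag(Generic_HTML_Tag):
    '''Hypothetical <body> tag that adds onload only when a handler is given.'''

    def __init__(self, indent_level=0, onload=None):
        Generic_HTML_Tag.__init__(self)
        self.indent = indent_level
        self.onload = onload
        self.generateHTMLPrefix()
        self.generateHTMLPostfix()

    def generateHTMLPrefix(self):
        attributes = ""
        if self.onload:
            # Assumes the handler text contains no double quotes; real code
            # would need to escape the attribute value.
            attributes = ' onload="' + self.onload + '"'
        self.prefix = self.indent*self.TAB + "<body" + attributes + ">"

    def generateHTMLPostfix(self):
        self.postfix = self.indent*self.TAB + "</body>"
```

Because the extra logic lives entirely inside generateHTMLPrefix(), the tree and its traversal need no changes: BODY_onload_tag(onload="init()") can be used as a payload anywhere BODY_tag() was.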

 

Now that you have, let’s say, a file with your tree and node called HTMLTree.py and a file with your html markup classes called HTML_Tags.py, let’s bring it all together:

import HTML_Tags
from HTMLTree import DefTreeNode, DefinitionTree


if __name__ == "__main__":

    # Build the tree: <html> is the root, <head> and <body> are its children
    html = DefTreeNode("html_tag", HTML_Tags.HTML_tag())
    head = DefTreeNode("head_tag", HTML_Tags.HEAD_tag())

    head.addChild(DefTreeNode("title_tag", HTML_Tags.TITLE_tag(), "basic title here!"))
    html.addChild(head)

    body = DefTreeNode("body_tag", HTML_Tags.BODY_tag())
    body.addChild(DefTreeNode("p_tag", HTML_Tags.P_tag(), "paragraph here"))
    html.addChild(body)

    htmltree = DefinitionTree(html)

    searchlabel = "p_tag"
    print('Searching for a node labeled: ' + searchlabel)

    node = htmltree.findNode(searchlabel)
    if node:
        print("\n\n" + node.getPrefix() + node.getContents() + node.getPostfix())
    else:
        print('\n\nNode with label "' + searchlabel + '" not found!')

    print('\n\nTree Traversal:\n')

    print(htmltree.traverse())

The output should look similar to this:

Searching for a node labeled: p_tag


<p>paragraph here</p>


Tree Traversal:

<html><head><title>basic title here!</title></head><body><p>paragraph here</p></body></html>

Great! You dynamically generated HTML in an extensible way!

startups, Technical

Startup Series: Episode 1

Surprise!

The distant memory that is my last post is not due to me forgetting about this beautiful little blog. I am in the process of starting a new company and, between that and my thesis work, I have not had much time to work on new posts. But I am back – and with an exciting new series that chronicles my journey through the thrilling roller coaster that is the start-up experience!

I have some planned episodes (including this one) and have reserved room for additional episodes as time, and challenges, go by. This series will address issues, main concepts, and directions that I have encountered while running my start-up. Also, for those who are interested, my new company is called R-Gauge Metrics Inc.

Along the way I will try to provide some reading material that has helped me with my company. Maybe it will help with yours too.

All-righty then! Let’s get started!

 

Episode 1: Inception

Inception: The establishment or starting point of something; the beginning.

While cruising through life I have met, as I’m sure you have too, some people who were extra-keen on starting their own company. This is usually exciting for me and I love to see such motivated individuals strive towards their goals. However, upon further inquiry, I sometimes notice that the person just wants to start their own company for the sake of starting their own company. Their drive is strong and their intentions are good, but without a clear problem to solve, or product to sell, they resemble the captain of a ship that leaves harbour without a destination: they are more likely to get lost and find trouble.

Introduce the idea. The idea is just that. It is your direction – your purpose. A business does not progress unless it has some goal to progress towards, and, no, making lots of money is not a satisfactory goal to launch a business (although we all hope that we will be able to make lots – we will need something more concrete).

For me, my idea started more than a year ago, and its form was very different than the form I eventually ran with. The original idea was quite simple. I thought many companies were not handling their online reputation well and I thought that I could help them with damage control and bad reputation mitigation. I cited the BP oil spill as an example to my friends while trying to convince them that this idea was the best thing ever:

“Just think. Some company, like BP, has a huge reputation drop due to them doing something – like spilling millions of gallons of oil into the ocean. Now whenever I Google ‘BP’ I get all that bad reputation on the front page: bad for business. What if I was able to provide consulting and help mitigate their online reputation damage?”

It sounded good. I started doing my market research and also looked into who else was working in this industry. Turns out that there were quite a few people who thought the same as I did.

My peers were skeptical. It seemed like a lot of manual work that would only succeed if I did all the work myself (Not really automatable). Not scalable at all. I would only be able to take on as many clients as I could handle myself – or hire lots of people to handle the clients for me.

I decided more discussion and bouncing ideas off of people was in order. I did this for almost six months before the notion hit me: All these sites assume that you know what your reputation is like! How do you quantify it? If I know exactly what it is, I can compare reputations!

It took another six months to generate a rough prototype to see if this was actually possible. Turns out it was! I was on my way!

 

The key lesson I learned here was quite simple: Really spend the time to understand the problem that you are trying to solve and make sure that people care about solving that particular problem. Understand what your goals are and make sure to express the value of your solution in terms of your client. In other words, why should your client care? Also, note that the idea is malleable. It can change to what the customers actually want. It doesn’t help to make the best product in the world if nobody wants it.

How do you know if people will use a solution to that particular problem? Well, you could ask them. The simple solution is usually the easiest. In my case, I asked anyone who owned a company or was in mid-to-upper management of any company I could find. I asked them if they would be interested in a tool like mine and carefully listened to their responses.

Things were looking really good. I felt that I had a great product that was relevant to people in my target market. They seemed quite interested in it. It was time to look into this whole “starting a company” thing.

 

But how…

General

Showing Up

Richard Branson talks at length in his book “Like a Virgin” about various topics in the business world. He addresses issues brought up by aspiring entrepreneurs and seasoned veterans in their journey to provide great products and services.

One of the points Richard addresses is the importance of simply showing up. I remember reading that section and thinking that I would take this advice with a grain of salt. What if I am competing against some of the best in the world?

I was skeptical.

A few months passed and I received an email from a prominent financial institution; it detailed a contest where Canadian postsecondary students can submit an essay on what their vision of a responsible financial institution is.

I was intrigued.

I started thinking. I am not a financial institution expert or well-versed in what makes them responsible. All I could do was think of my own convictions. What did I think a responsible financial institution looked like? It ended up looking like a simple essay with a list of suggestions – and it was. I was certain that I would not win, but I felt strongly about it.

I am surrounded by smart people all day. I would wager that most of them are far smarter than me, but I was the only one who entered the contest. All of them said something similar when I asked them if they would enter the competition: there would be people far smarter than them who would write something and win.

I definitely had those same thoughts, but instead of giving up before I had even written a single word I figured that I would at least try. I showed up.

I wrote something that I felt strongly about. Why wouldn’t I show people?

I placed second in the Canada-wide contest.

New York Magazine published an article in February of 2007 that covered research on just this topic. The studies it referenced identified a link between children being told they were “smart” and their reluctance to try things that did not come naturally to them. In general, they found that children who were constantly praised for their intelligence were more likely to quit when things didn’t come naturally.

I am extending this idea to include self-deprecating mentalities in adults who believe themselves to be intelligent.

Intelligent adults will assess the situation and gauge their ability to succeed based on their own perception of their capabilities. The difference between the study with the children and my extension into the adult realm is that the children actually try before they meet failure, while the adults anticipate the difficulty before attempting anything. The result is the same: neither group completes the attempt.

This brings further reinforcement to the saying “you are your own worst enemy.”

I suppose I should be grateful for that mentality as it allows others, such as myself, to try and succeed. I cannot help but wonder, though, what breakthroughs might have happened if those people would actually try.

Sources:

New York Magazine, How Not to Talk to Your Kids: The inverse power of praise, http://nymag.com/news/features/27840/

Like a Virgin: Secrets They Won’t Teach You In Business School, Richard Branson, http://www.virgin.com/richard-branson/books/like-a-virgin

General Science, Technical

Privacy and Identity

I have always been a (relatively) cautious man when it came to providing personal information online. I know these words come from the owner of calebshortt.com, but willfully disclosing information online is different than providing information that is to be kept private. Different rules apply. Some information can leak though – such as your name in news articles, academic papers, crawlers that scrape social media, etc.

In many talks, courses, and my own discussions I am seeing a trend where the “traditional” sense of privacy, where the idea is to not provide any information unless it is required, is shifting (with the help of social media). This “minimalistic” mentality is great for restricting the dissemination of personal information – especially online. However, the “new” sense of privacy gravitates towards liberally providing personal information and having complete control over how that information is accessed by third parties (or the original holder of the information).

This is a big change, and it can lead to disastrous outcomes.

Take myself, for example: I limit the information that I add to social media websites (if I use any at all) and I make sure to continually review the privacy measures for each one. I try to apply the best of both “traditional” and “new” privacy approaches. This can work fantastically if you explicitly trust the website (or group) to secure your private information. I even run queries on my name in various search engines to see what the results are. This gives me a rough measure as to how exposed I am to crawlers.

What I did not expect is that, after taking care to secure my own online “identity” and my private information, the weakest link would be my government. I am talking about the current situation with the Canadian Student Loan information breach. A removable hard drive with the personal information of over half a million current, or previous, students disappeared. I was shocked. I suppose I shouldn’t have been.

All of my hard work, circumvented by the carelessness of a person I had never met – someone that I never knew was even handling my personal information.

Through my frustration I have come to be reminded that the weakest link in most security, or privacy, chains is the human link. The link that requires a person to have the correct training, common sense, and authorization to access, transport, and dispose of my personal information correctly and securely.

In my case, this is the second time that a major organization has “lost” my personal information due to a removable hard drive. Note that removable hard drives are usually restricted for exactly this reason.

All I can do is take the necessary precautions – now that it’s out there.

General

“Smart”

When I tell someone that I am a Computer Scientist, and that I am working towards finishing my Master’s Degree in it, many of them remark on how “smart” I must be to achieve such a goal. I am taken aback by this response as I do not view myself as any more intelligent than they are. What, then, makes Computer Scientists fall into such an automatic assumption?

The answer may lie, not in the intelligence of the individuals, but in the way that they interact with their surroundings. Their world.

I am a Computer Scientist, but my skills do not fall solely within that realm. I am an avid baker. I surf and skateboard. I am mechanically inclined and can fix my own vehicles. I can play multiple instruments. I am known to write occasional prose and poetry. I read frequently – and on various topics. I keep up with current events. I have an extensive knowledge of movies and music. I play billiards at the competitive level. I am an amateur scotch taster.

The question is why did I decide to develop these hobbies and skills? The answer, for me at least, is that I was curious. I started baking bread because I was curious how it would work out. I got quite good at it through trial and error. Now, I can bake a decent loaf or two with no trouble at all. I have even made artisan loaves at the request of friends. When I saw a YouTube video of someone playing the ukulele I thought that it would be fun to play. I went to the music store, bought a cheap ukulele, and started to play some basic tunes from online tutorials. Now I can play a variety of songs – which goes well with surfing.

Many Computer Scientists are just like me. It is unacceptable for them to “not know” what to do if they need to, say, sharpen a knife. They will go out and learn how to sharpen their own knives. If there is a problem, they try to fix it. If there is something they do not know, they try to learn about it so that, next time, they will know. We are constantly learning. This might be brought on by such a fast-paced field – where first-year textbooks can be outdated before the students graduate.

This trait is not limited to Computer Scientists. There are many who are driven to better themselves. Sure, it takes some grades to get into Computer Science, but it takes grades to get into many fields of study. The “smart” that seems to be automatically associated with Computer Science may derive from this need to better ourselves – and solve problems. This builds a large skill-set that helps us solve even more problems.

And solving problems is something that we are very good at doing. Maybe that is what “smart” is after all.
