
Saturday, December 29, 2018

Some good files to ignore: .DS_Store, a hidden file generated by Mac; .ipynb_checkpoints, hidden folders generated by Python notebooks, aka Jupyter Notebooks; and large image datasets used for machine learning and deep learning training. .gitignore itself is prefixed with a dot, so it is a hidden file: it usually doesn't show up in Finder or other file viewers, but it can be opened in a text editor such as Sublime Text.

Sunday, December 2, 2018

Tensors are the basic units of deep learning frameworks: neural networks run their calculations on them. A one-dimensional tensor is like a list of elements. A two-dimensional tensor is like an Excel sheet: it has a row dimension and a column dimension. A 3D tensor is like an RGB image: each pixel has a red, green and blue value, so the image has a height, a width and a color-channel dimension.
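The three cases above can be sketched with plain nested lists. This is only an illustration, not how a deep learning framework stores tensors; the helper name shape and the sample values are made up for the example.

```python
# Tensors of rank 1, 2 and 3 modelled as plain nested lists (illustration only).

def shape(tensor):
    """Return the size of each dimension of a regular, non-empty nested list."""
    dims = []
    while isinstance(tensor, list):
        dims.append(len(tensor))
        tensor = tensor[0]
    return tuple(dims)

vector = [1.0, 2.0, 3.0]                  # 1D: a list of elements
matrix = [[1, 2, 3],
          [4, 5, 6]]                      # 2D: 2 rows x 3 columns
# 3D: a 2 x 2 image where every pixel holds [red, green, blue]
image = [[[255, 0, 0], [0, 255, 0]],
         [[0, 0, 255], [255, 255, 255]]]

print(shape(vector))  # (3,)
print(shape(matrix))  # (2, 3)
print(shape(image))   # (2, 2, 3)
```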

Monday, September 24, 2018

In some cases stop words matter. For example researchers found that stop words are useful in identifying negative reviews or recommendations. People use sentences such as "This is not what I want." "This may not be a good match." People may use stop words more in negative reviews. Researchers found this out by keeping the stop words and achieving better prediction results.

Removing punctuation may also yield better results in some situations

Tokenization: breaking text into tokens. Example: breaking sentences into words, or into larger groups of words depending on the scenario. There are also the n-gram model and the skip-gram model.

Basic tokenization is 1-gram. N-gram, or multi-gram, is useful when a phrase yields better results than a single word. For example, take "I do not like Banana." The 1-gram tokens are I, do, not, like, banana. A 3-gram model may yield better results: I do not, do not like, not like banana. Some implementations also pad the end of the sentence, which adds tokens such as like banana _space_ and banana _space_ _space_.

n-gram: n is the number of words we want in each token. Frequently, n = 1.
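A minimal sketch of word-level n-grams in plain Python. The function name ngrams is made up for this example, and the tokenizer is deliberately naive: it lowercases, drops a trailing period and splits on whitespace, with no padding.

```python
def ngrams(text, n=1):
    """Split text on whitespace and return every run of n consecutive words."""
    words = text.lower().rstrip(".").split()
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]

sentence = "I do not like Banana."
print(ngrams(sentence, 1))  # ['i', 'do', 'not', 'like', 'banana']
print(ngrams(sentence, 3))  # ['i do not', 'do not like', 'not like banana']
```

With n = 3 the token "do not like" keeps the negation together, which is the kind of phrase a 1-gram model loses.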

Lemmatization: transforming words into their roots. Example: economics, micro-economics, macro-economists, economists, economist, economy, economical, economic forum can all be traced back to the root econ, which suggests the text or article is largely about economics, finance or economic issues. Useful in situations such as topic labeling. Common tools: WordNetLemmatizer (a lemmatizer, which maps words to their dictionary form) and the Porter stemmer (a stemmer, which strips suffixes by rule).

Exploratory Data Analysis

Histogram plotting: the input is the list of values whose distribution we want to plot. Specify the bins; each sample can also be weighted differently, so it doesn't have to count as 1. The hist function also returns values: how many items fall in each bin, along with the plot.
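To make the binning and weighting concrete, here is a plain-Python sketch of what a histogram computes before anything is drawn. It is not the plotting library's implementation; the function name histogram, its parameters and the sample ages are assumptions for illustration.

```python
def histogram(values, bins, low, high, weights=None):
    """Return (counts, edges) for equal-width bins over [low, high]."""
    if weights is None:
        weights = [1] * len(values)      # default: every sample counts once
    width = (high - low) / bins
    edges = [low + i * width for i in range(bins + 1)]
    counts = [0] * bins
    for value, weight in zip(values, weights):
        if low <= value <= high:
            # the last bin is closed on the right so that `high` is included
            index = min(int((value - low) / width), bins - 1)
            counts[index] += weight
    return counts, edges

ages = [21, 25, 32, 38, 45, 47, 60]
counts, edges = histogram(ages, bins=4, low=20, high=60)
print(counts)  # [2, 2, 2, 1]

# Weigh the last sample three times as heavily as the others
weighted, _ = histogram(ages, bins=4, low=20, high=60, weights=[1, 1, 1, 1, 1, 1, 3])
print(weighted)  # [2, 2, 2, 3]
```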

It is also important to do feature extraction and dimensionality reduction before feeding data into a machine learning algorithm: simplify the data and reduce the computational cost. Algorithms will run faster and more efficiently, use less memory and, in some cases, even perform better.

Anomaly detection, outlier detection to handle or remove outliers and abnormality in the data to help the model generalize better and be a more accurate representation.

Machine Learning

Machine Learning is emerging as a popular field of data science. It has predictive power, employs applied statistics and pattern recognition technologies.

Machine learning is taking data mining to the next level.

Major machine learning tasks include classification, regression and clustering.

Questions that Business Analysts and Decision Makers are Interested In

Who are the best customers? aka Who are the customers with the best Customer Lifetime Value?

Causal relationship:

Results of recent experiments (More prevalent in Startup Culture)

Hypothesis testing: is one segment actually different from another?

Is the result significant or is it random chance?

Please note that causal relationship determination requires controlled studies to control for extraneous variables. In many industries, such as biotech, statistical significance is a must, a prerequisite for next step analysis or more business investments.
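One way to ask "is the result significant or is it random chance" for two segments is a permutation test. The sketch below is a simplified illustration, not a replacement for a properly designed controlled study; the function name, the number of rounds and the fixed seed are all assumptions.

```python
import random

def permutation_p_value(group_a, group_b, rounds=2000, seed=0):
    """Estimate how often a random relabelling gives a mean gap at least as large."""
    rng = random.Random(seed)            # fixed seed so the sketch is repeatable
    observed = abs(sum(group_a) / len(group_a) - sum(group_b) / len(group_b))
    pooled = list(group_a) + list(group_b)
    hits = 0
    for _ in range(rounds):
        rng.shuffle(pooled)              # pretend the segment labels are random
        a = pooled[:len(group_a)]
        b = pooled[len(group_a):]
        gap = abs(sum(a) / len(a) - sum(b) / len(b))
        if gap >= observed:
            hits += 1
    return hits / rounds

# Hypothetical order values for two customer segments
segment_a = [10, 11, 12, 13, 14, 15, 16, 17]
segment_b = [1, 2, 3, 4, 5, 6, 7, 8]
p_value = permutation_p_value(segment_a, segment_b)
print(p_value < 0.05)  # True: a gap this large is unlikely to be random chance
```

A small p-value only says the difference is unlikely under random labelling. It does not by itself establish a causal relationship.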

Sunday, August 5, 2018

Did you know that having a daily routine improves efficiency and productivity? Mark Zuckerberg of Facebook famously wears the same grey shirt and hoodie on a daily basis to simplify wardrobe choices and save minutes each day.

Automate everything: use API connectors to connect applications such as Gmail, Shopify and Trello without coding: Zapier, IFTTT, Do Button

Join an online initiative to go complaint free for one month and induce positivity in your life https://gonoco.com/

Easily distracted? Having trouble finishing meaningful tasks? Try a 30/30 timer rule: switching tasks every 30 minutes. There are iOS apps that time you and chime for you to make a switch and move on. http://3030.binaryhammer.com/

Social Network, Social Marketing and Growth Lifehacks

Use Tweepi to flush Twitter followers that are inactive or don't follow you back

Personal Finance Productivity

Use a stock, mutual fund screener to find stocks and funds that match your investment goals for your 401K plan

Developer Productivity:

Pair programming for productivity - AirPair and Pivot Labs, a premium development consulting agency for startups and new technology companies, talk about pair programming for developer productivity http://www.airpair.com/pair-programming/

Always look for shortcuts to do more things faster. Some developers even use fast note-taking apps like Notational Velocity and combine them with hot keys to shave fractions of a second off their daily routine.

Code a mobile app without learning iOS or Android development: Ionic framework http://ionicframework.com/

Startup Productivity

Use prototypes and wireframes as visual aids to communicate product visions and designs clearly.

Did you know that having a 3D printed prototype generates 3x more feedback for architecture and physical product designers than just having a concept drawing?

Did you know that famous universities like Stanford teach students to print or draw iOS UIs and designs on paper and walk users through imaginary steps to get design feedback before they code?

Looking for great business ideas? Use a startup name or domain generator to get inspired!

Udacity launches an AI for Trading nanodegree with WorldQuant, which is also its hiring partner. If you are ready to do artificial intelligence for fintech, this may be your nanodegree! What's the ultimate dream? Probably joining a quantitative hedge fund, eventually. It is said that a little less than 30% of all US trades are done by computers. Specifically, you want Python for finance and historical data skills.
- https://blog.udacity.com/2018/08/introducing-the-artificial-intelligence-for-trading-nanodegree-program.html

Author Adam Fisher launches Valley of Genius as told by the hackers, founders, and freaks who made it. If you like HBO's Silicon Valley, you will probably like these unicorn and innovator stories of Silicon Valley

Great Escape! Medium is running an August author challenge: tell Medium why and how you quit your job! https://medium.com/s/greatescape/tell-us-about-the-best-time-you-quit-your-bad-job-aaaf6d5b4e20 Your story may be featured. See this challenge post by Medium's editor.
- https://medium.com/s/greatescape

What does it feel like to be Steve Jobs's daughter? Her memoir is now available for readers. See this article on Vanity Fair.
- https://www.vanityfair.com/news/2018/08/lisa-brennan-jobs-small-fry-steve-jobs-daughter

YouTube machine learning and artificial intelligence celebrity Siraj wants to start his own School of AI. He wants it to be a "nonprofit". Strange but true. He's now recruiting Deans to head cities.

BIDW may employ more stable, heavy-duty and less flexible architectures, schemas and data stores than startups in Silicon Valley. That may be a sacrifice made for the security and stability that many Fortune companies rely on.

Structured Query Language (SQL)

Despite the popularity of many new data stores and technologies such as Hadoop, Spark, Pandas etc., many companies still require Business Analysts to be fluent in SQL. Never forget SQL.

Graphical User Interface (GUI)

A GUI helps business users query and drill into data without the help of the development department. The schema and database are still designed and implemented by dev.

Online Analytical Processing (OLAP)

Provides a GUI-based query platform for business users to do data exploration with minimum help from the dev department.

Analysts and decision makers can quickly and efficiently do data analysis and ad hoc reporting without too much help from a data scientist or database administrator.

The schema, reports, and drilling depth may need to be pre-planned, designed and tested before being released to business users.

This is also a large-scale system, suitable for companies such as Macy's, Gap and Walmart, which have millions of new sales records per hour.

OLAP is for data exploration by large businesses.

Data Warehousing

Data Warehousing is a serious challenge for large companies with many transactional records, product offerings across many departments.

Many DW providers can also provide integrated data mining and business intelligence services built on top of proprietary DW hardware (including the server stack) and software.

Best Practice

Sales teams on the road often need faster, better data on mobile devices to seal a deal. Don't be surprised if they get mad when numbers are off! They bring home the dough.

SELECT, INSERT, UPDATE with SQL

The Equivalent of HelloWorld of SQL

SELECT *

FROM table_name

Selects all columns and rows from a table. In real-life practice, we may want to avoid SELECT * because it may fetch and display a lot of unnecessary columns and records, using up precious computing resources, especially on large systems at companies with large databases.

A Basic Select Statement

SELECT ProductID, Name FROM Product WHERE Price > 2.00

A Fancier Select Statement

SELECT * FROM CUSTOMERS WHERE AGE > 25 AND SEX = 'F' AND REGION='CA'

The * means all columns. In this statement every column is returned, but only for the rows that satisfy all three conditions in the WHERE clause.

An Insert Statement
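A minimal sketch of INSERT, followed by an UPDATE and the basic SELECT from above, run through Python's built-in sqlite3 module against an in-memory database. The Product table's column types and the sample rows are assumptions made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")   # throwaway in-memory database
conn.execute("CREATE TABLE Product (ProductID INTEGER PRIMARY KEY, Name TEXT, Price REAL)")

# INSERT adds new rows; the ? placeholders keep values separate from the SQL text
conn.executemany(
    "INSERT INTO Product (ProductID, Name, Price) VALUES (?, ?, ?)",
    [(1, "Pencil", 0.50), (2, "Notebook", 3.25), (3, "Backpack", 24.00)],
)

# UPDATE changes existing rows that match the WHERE clause
conn.execute("UPDATE Product SET Price = ? WHERE ProductID = ?", (2.75, 2))

rows = conn.execute(
    "SELECT ProductID, Name FROM Product WHERE Price > 2.00 ORDER BY ProductID"
).fetchall()
print(rows)  # [(2, 'Notebook'), (3, 'Backpack')]
```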

Useful SQL interview skills

Be able to compose advanced SQL queries, including aggregation, slicing and dicing.

Advanced SQL Query Select Count and Group By

It's easy to use SQL to display all the data columns and rows. But that's not practical. It's not practical for the business user to get the entire database, nor is it memory efficient.

How to view aggregate data? Use Group By, don't forget to use Count() too, else the result is again not meaningful.

SELECT COUNT(CUSTOMER_ID), STATE

FROM CUSTOMERS

GROUP BY STATE

ORDER BY COUNT(CUSTOMER_ID) DESC;

Group By aggregates data. In this case we are interested in aggregating data by state in the Customers table. What kind of statewide information are we trying to get? We are counting the number of customers in each state, as measured by customer_id. In addition, once the data is aggregated, the results are ordered by count(customer_id) in descending order, from the largest count to the smallest.

Compare a Select all statement, which just returns all the data rows,
to
a Select Count() and Group By statement, which aggregates data by state.
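The comparison can be tried out with Python's built-in sqlite3 module. This is a sketch on a tiny in-memory table; the six sample customers are made up, and only the two columns the query needs are created.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE CUSTOMERS (CUSTOMER_ID INTEGER PRIMARY KEY, STATE TEXT)")
conn.executemany(
    "INSERT INTO CUSTOMERS (CUSTOMER_ID, STATE) VALUES (?, ?)",
    [(1, "CA"), (2, "CA"), (3, "NY"), (4, "CA"), (5, "NY"), (6, "TX")],
)

# Select all: one output row per customer
all_rows = conn.execute("SELECT * FROM CUSTOMERS").fetchall()

# Count + Group By: one output row per state, largest count first
per_state = conn.execute(
    """
    SELECT COUNT(CUSTOMER_ID), STATE
    FROM CUSTOMERS
    GROUP BY STATE
    ORDER BY COUNT(CUSTOMER_ID) DESC
    """
).fetchall()

print(len(all_rows))  # 6
print(per_state)      # [(3, 'CA'), (2, 'NY'), (1, 'TX')]
```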

SQL is great for the following queries:

SQL segmentation example: analyze by location, e.g. SELECT location, COUNT(*) FROM sales GROUP BY location

Spark and the new way to run SQL queries on structured, distributed data

Firebase real time database and JSON

JSON objects

NoSQL databases like MongoDB

SQL Security

Cross Site Scripting and SQL Injection

If users are allowed to enter special characters in input boxes and forms on a website, hackers may submit code that runs SQL queries against your database (SQL injection) or scripts that run in other visitors' browsers (cross-site scripting), and get data from your website illegally. Many websites, such as Yelp, do not allow special characters. Some websites stringify or escape the user input before processing it on the server, so special characters are treated as plain text rather than code, which reduces the security risk.
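A small sketch of the SQL injection half of this, using Python's built-in sqlite3 module. The users table, the two function names and the malicious input are made up for illustration. Parameterized queries are a common defense that works alongside input filtering.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, email TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)",
                 [("alice", "alice@example.com"), ("bob", "bob@example.com")])

def find_user_unsafe(name):
    # Vulnerable: the input becomes part of the SQL statement itself
    return conn.execute("SELECT email FROM users WHERE name = '" + name + "'").fetchall()

def find_user_safe(name):
    # Parameterized: the input is only ever treated as a value
    return conn.execute("SELECT email FROM users WHERE name = ?", (name,)).fetchall()

malicious = "nobody' OR '1'='1"
print(len(find_user_unsafe(malicious)))  # 2: every row leaks
print(len(find_user_safe(malicious)))    # 0: no user has that literal name
```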

Thursday, August 2, 2018

The goal is for mobile developers to load images onto mobile applications when limited memory is available.

Android drawable images can be set as the source of an ImageView by referencing @drawable/my_img. The image file extension is left out of the reference. Drawable refers to the fact that the image can be drawn on the screen. Android manages all drawables in the res/drawable directory.
https://developer.android.com/guide/topics/resources/drawable-resource

Drawables mainly support bitmap formats, including .jpg, .png and .gif. The unit element of these images is a pixel.

Density-independent pixels (dp, or dip) allow an ImageView to scale and resize across screen sizes and pixel densities, across the wide variety of Android devices. Specifying button size in dp instead of px makes sure the button is still reasonably sized and clickable on high-resolution, high-density screens (a high number of dots or pixels per inch).
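The dp idea comes down to one conversion: px = dp * (dpi / 160), where 160 dpi (mdpi) is the baseline density. The sketch below works through it in Python for the standard density buckets; the function name dp_to_px and the 48 dp button size are just illustrative choices.

```python
# Nominal dpi of the common Android density buckets
DENSITY_DPI = {"mdpi": 160, "hdpi": 240, "xhdpi": 320, "xxhdpi": 480}

def dp_to_px(dp, dpi):
    """Convert density-independent pixels to physical pixels: px = dp * (dpi / 160)."""
    return round(dp * dpi / 160)

# A 48 dp button keeps the same physical size but needs more pixels on denser screens
for bucket, dpi in DENSITY_DPI.items():
    print(bucket, dp_to_px(48, dpi))
# mdpi 48
# hdpi 72
# xhdpi 96
# xxhdpi 144
```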

A best practice for keeping memory use small is to include different image sizes for the different screen densities. At runtime Android automatically loads the drawable asset from the folder that matches the device's density: mdpi, hdpi, xhdpi, xxhdpi.

Developers also use ImageMagick to compress photos and Android Drawable Importer to convert images to drawables https://plugins.jetbrains.com/plugin/7658-android-drawable-importer

Bash can improve developer productivity. It is available on Mac via terminals. Developers can use bash to write build scripts, enhance dev productivity, use curl to visit and process websites, interact with file systems, modify files, pipe outputs into files.

SVM can use kernel functions to make data linearly separable in a transformed space, so it can give nonlinear, intricate decision boundaries. For a linear SVM, the decision boundary is a straight line. To check your data, apply a linear SVM: if it has 0% training error, your data is linearly separable.

The C parameter of an SVM controls the trade-off between a smooth decision boundary and classifying training points correctly: you either get a smoother boundary or get more training points classified correctly, and the latter may not generalize well. The effect of C is especially obvious with the RBF kernel. A large C means getting more training points correct. Larger C --> more intricate boundaries.

Gamma Parameter
Gamma defines how far the influence of a single training example reaches. If gamma has a low value, each point has a far reach; if gamma has a high value, each point has a closer reach. A high gamma value will make the decision boundary pay close attention to the points that are close, but ignore those that are far. A high value of gamma could mean a very wiggly decision boundary.

With a high gamma, a point close to the frontier can carry a lot of weight and pull the frontier close to itself. A low gamma, in contrast, means more points have some influence on the frontier, so the frontier ends up being smoother.
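The "reach" of a point can be seen directly in the RBF kernel, which is commonly written as K(x, y) = exp(-gamma * ||x - y||^2). The sketch below only evaluates that formula in plain Python; it does not train an SVM, and the sample points and gamma values are arbitrary.

```python
import math

def rbf_kernel(x, y, gamma):
    """Similarity between two points under the RBF kernel."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)

origin = (0.0, 0.0)
near, far = (1.0, 0.0), (3.0, 0.0)

for gamma in (0.1, 10.0):
    # Low gamma: even the far point still has noticeable influence.
    # High gamma: influence drops to almost nothing beyond the immediate neighborhood.
    print(gamma, rbf_kernel(origin, near, gamma), rbf_kernel(origin, far, gamma))
```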

Sunday, July 29, 2018

Modern games are designed around addictive cycles to keep gamers engaged.
It's a big deal because it can shake your moral ground. Making a game addictive means making a successful product but also potentially doing harm to gamers.

Game Algorithms

In-Game Economy Design
Virtual goods are all the rage. That's how a lot of freemium games and social games make a buck these days.

Online Social Games
Examples include Facebook games like FarmVille: get lots of traffic, viral factor, millions of people can play it each day (at its height 100 million plus players play online social games each day)

Some gaming companies got so huge that their entire focus shifted to analytics instead of game design.

Concept - Gamification
Giving things that are not pure games gaming elements and incentives to drive results. Gamification takes advantage of fun and addictive gaming mechanics to encourage results.

Thursday, July 19, 2018

Explore: <s, a> ---> s' reads: move from the current state s to s' via action a. Through the action a reward is received; it can be positive for positive reinforcement, or negative for punishment or discouragement. As the robot (the agent) explores the environment, it updates the Q table, which tracks the accumulated scores.

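The Bellman equation is one of the utility equations used to track scores.
U(s) = R(s) + ɣ max_a Σ_s' T(s, a, s') U(s')
Here T(s, a, s') is the probability of landing in state s' after taking action a in state s. The function is nonlinear because of the max. This fancy function means the current utility is the reward of the current state plus a discounted fraction (ɣ) of the best expected utility reachable through future actions and future rewards.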

Start with arbitrary utility, explore, and update based on allowed neighboring moves, based on the states it can reach. Update at every iteration.
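The update loop described above can be sketched as value iteration on a tiny made-up world: four states in a row, where state 3 is the goal. Every number here (rewards, the 80% success rate, the discount of 0.9, the number of sweeps) is an assumption chosen for illustration.

```python
GAMMA = 0.9
STATES = [0, 1, 2, 3]
TERMINAL = 3
REWARD = {0: -0.04, 1: -0.04, 2: -0.04, 3: 1.0}

# T[(s, a)] lists (probability, next_state) pairs; a move succeeds 80% of the
# time and otherwise leaves the agent where it is
T = {}
for s in STATES[:-1]:
    T[(s, "right")] = [(0.8, s + 1), (0.2, s)]
    T[(s, "left")] = [(0.8, max(s - 1, 0)), (0.2, s)]

def value_iteration(sweeps=100):
    utility = {s: 0.0 for s in STATES}          # arbitrary starting utilities
    for _ in range(sweeps):
        updated = {}
        for s in STATES:
            if s == TERMINAL:
                updated[s] = REWARD[s]          # nothing to reach from the goal
                continue
            # max over actions of the expected utility of the reachable states
            best = max(
                sum(p * utility[s2] for p, s2 in T[(s, a)])
                for a in ("left", "right")
            )
            updated[s] = REWARD[s] + GAMMA * best
        utility = updated
    return utility

utilities = value_iteration()
print(utilities)
```

States closer to the goal end up with higher utility, which is what lets the agent pick the move that leads toward it.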

Wednesday, July 18, 2018

F1 score is a useful metric for classification models rather than regression models. It goes well with the confusion matrix, and it is a performance score that is also frequently used in statistical analysis. You can read more about F1 score on its Wikipedia page and in the sklearn F1 score documentation.

F1 score and accuracy score are both used in classification tasks. Accuracy has some shortfalls, for example when the dataset is obviously biased. If most of the input data is negative (of the negative class), say 99.99%, then the machine does not need to learn anything intelligent. It can just guess "negative" every time and still be 99.99% accurate. F1 score is a shorthand, composite score built from the confusion matrix: it uses the true positives, false positives and false negatives, and ignores the true negatives.

F1 score is a combination of recall and precision: their harmonic mean. It is also a shorthand to measure how accurate and useful the result is.

Accuracy is a simple fraction of correctly classified objects over total number of objects.

It can be misleading to only focus on accuracy, especially when data labels are imbalanced, even if data is representative. Certain scenarios are simply more prevalent in the population data. For example, by definition orphan diseases are the minority data points in the real world.
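A quick sketch of why accuracy misleads on imbalanced labels. The function name scores and the confusion-matrix counts are made up; the formulas are the standard ones (F1 is the harmonic mean of precision and recall), with zero returned when a denominator would be zero.

```python
def scores(tp, fp, fn, tn):
    """Return (accuracy, precision, recall, f1) from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if precision + recall else 0.0
    return accuracy, precision, recall, f1

# 10,000 samples, only 10 positive; the model always answers "negative"
always_negative = scores(tp=0, fp=0, fn=10, tn=9990)
print(always_negative)  # (0.999, 0.0, 0.0, 0.0)

# A model that actually finds 8 of the 10 positives, with 4 false alarms
useful_model = scores(tp=8, fp=4, fn=2, tn=9986)
print(useful_model)
```

Both models have an accuracy above 99.9%, but only the second has a non-zero F1 score.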

Udacity is pricey. You are on a budget and desperately need a better job. Here are some great tips to take advantage of your Udacity subscription.

Udacity Career Partners and Career Hub

Udacity offers video tutorials on technology as well as on how to write a resume, start a startup and more. In addition, each nanodegree is created in partnership with top tech companies. Take advantage of these hidden connections: reach out to content creators and industry leaders from companies like Google.

Udacity Career Conferences

It's real. It works. There are actually top Silicon Valley companies that come to review your resume and interview you in person. Highly recommended. I can give a lot of personal anecdotes about how well this worked out for me.

Udacity Career Profile

Completed multiple nanodegrees? You can turn your "ADHD" and inability to stop learning into a career advantage: show on your career profile that you had the grit and resourcefulness to complete multiple nanodegrees. Update it regularly.

Make your Capstone Project Portfolio Ready

These days, companies hire you if you have a great portfolio, not a great label. Turn your capstone project into a recruiter-ready, professional Medium post, a GitHub repo, or LinkedIn-ready slides or PDF. Do this while completing the capstone. It is so much easier. Once you are done with the nanodegree, it's really hard to go back.

For example, Udacity digital marketing project slides are presentation ready. And you get real-world experience marketing for Udacity on Adwords, Facebook and Instagram.

Mentor

Though not always helpful, a Udacity Nanodegree subscription does come with an online mentor. Remember, you can always request to change mentors if you have any trouble.

Tuesday, July 17, 2018

You are here to win and start a startup, but the journey of being an entrepreneur can be lonely, especially if you are a solopreneur. Have you thought about starting a meaningful relationship while you are here? Here are some tips and resources for you.

Hinge: a Tinder-like dating app for professionals, often Ivy League educated young professionals

Coffee Meets Bagel blog: another dating app, which offers some advice on its blog.

Brainstorm Startup Ideas and Domain Names Using Generators

One surprisingly easy hack is to precede the startup name with "try" or "get", example: getAlto, tryAlto

http://itsthisforthat.com/

Bootstrap - Front End Framework

Bootstrap, previously known as Twitter Bootstrap, is a super popular framework for front-end development.

Use GSuite or Google Domain to Host Your Custom Domain Email

Want to have email@mycompany.co instead of mycompany@gmail.com so that your company looks official and trustworthy? Use GSuite's Gmail hosting or the email forwarding service of Google Domain (forwarding only works one way: it only receives custom domain emails).

Material Design - Front End Framework and Stylebook

Flat UI - CSS Framework

Prototyping, Wireframing Tools

Invision, Marvel

Marvel can easily mock up mobile apps in minutes, for free.

https://www.flinto.com/

Free Professional Apps for iPad

Expensive apps like Adobe's and premium MailChimp features are actually available in various forms as iPad apps. You can use advanced features for free! MailChimp even has an offline app for collecting emails at events and conferences.

Fiverr

Get gigs done on Fiverr for $5 and up.

Reddit, Product Hunt, Imgur, Hacker News - are all important social networks for founders at Startups

Unsplash - Stock Photo

Code Libraries

In addition to frameworks, there are also jQuery UI and WordPress themes.

WordPress Themes

Startup themes are available for purchase on ThemeForest. These templates will make your website look instantly like a legitimate startup. However, for WordPress speed is a serious concern. Without the snappy speed, startup websites will give off the wrong vibe. How can you raise funds for your tech startup if your website is slow?

Bootstrap, a popular front-end framework, can be easily integrated with WordPress.

Hire Designers on 99designs and Fiverr

Splashthat - Make Instant Event Invite Pages

These pages are called splash landing pages. Splash refers to how instant and short-lived the pages are; they are usually used for a particular event or purpose, or for an Optimizely experiment.

Co-working spaces are more than just physical spaces for an office. Those are also great places to connect and meet with people. It is a real community valuable for entrepreneurs, especially solopreneurs.

Mock up API calls https://www.mockable.io/

Web scraping tool: http://scrapy.org/ Careful: most sites are protected by Terms of Use, which generally prohibit scraping for commercial purposes.

While learning to code and bettering your coding skills, online learners should avoid getting stuck in the ocean of tutorials and videos: do not get stuck in learners' limbo. It is impractical to know all the details of a framework. Not every pilot knows how to build a plane, and not every machine learner needs to know all the math of all the algorithms!

LittleBits teaches kids hardware and software engineering through hands-on experience. It is slightly more accessible than Raspberry Pi. It comes with a variety of sensors and components, such as pressure, light and temperature sensors.