
So it’s 2018 and Jikan is now 1 year old! MyAnimeList announced in late 2017 that they’ll be working on fixing up their API, but until then I’ll have Jikan running around. I have some plans for Jikan that need to be done, hopefully by mid-2018 or earlier, depending on college.

There are some things I’m still interested in scraping off of MAL; here’s the list.

User Profile

Taking an example of my own profile;

There’s a lot of data available per user profile. The best part here would be their favorite characters, people, anime, manga and basic stats. The hardest part to extract here would be the user-based “About Me”, which is highly customizable. So this, I might consider parsing since MAL’s HTML source is already terrible enough.

Top Anime/Manga/People/Characters

These pages give you access to a paginated list of anime/manga/people/characters ranked by their popularity/favoritism by the community from #1 to the last ranking available. Tis a gold mine entry.

Anime/Manga/Person/Character Search!

The official MAL API already has this feature but it only returns the first page of results! It only allows simple string queries and requires user authentication for the API call to work, which is what Jikan is meant to overcome. This has been a requested feature, so I’ll most likely be working on a parser for this in the months to come.

Extended Data for Anime/Manga

This has been on the cards for Jikan since the beginning, but I’ve held off on any extended parsing other than characters/staff and episodes until recently, when I began making scrapers for Pictures, Videos & News related to the item. This trend will continue as there are more pages that consist of interesting data regarding an anime or manga. Especially the reviews page, since this has the best data for sentiment analysis and averaging of any show or manga.

Will be focusing on these 4 for this year! It takes time to mine pure data since scraping HTML off MAL means a lot of weird and round-about ways of doing things!

Introduction

As the idea of creating my own Anime Database sparked within me, I set out to parse data from an existing website, MyAnimeList, since I utilize it a lot for managing the content I parse through my mind.

I was dumbfounded when I realized that the official API did not support fetching anime or manga details. There was a way to do this via the official API but it was totally round-about. You had to use one of their API endpoints where you searched for a query and it would return a list of similar anime/manga with their details.

I could have used AniList’s API but I was already familiar with scraping data. I’ve done this before in a lot of former projects. And so I set out to develop Jikan to fulfill my parent goal: to make my own anime database. And so it became a project of its own.

History

Jikan was uploaded to GitHub on January the 11th with a single function of scraping anime data.

It wasn’t even called ‘Jikan’ back then, it was called the ‘Unofficial MAL API’. Quite generic, I know.

I came to terms with the name ‘Jikan’ as it was the only domain name available for the .me TLD and it’s a commonly used word in Japanese – ‘Time’. The ‘Plan A’ name was ‘Shiro’, but unfortunately everyone seemed to have hogged all the TLDs for it.

With this API, I guess you could say I’d be saving developers some … Jikan – Heh.

Enter;Jikan

Sounds like a title from the Steins;Gate multiverse.

Anyways, Jikan can provide these details from MAL simply by their ID on MAL

Anime

Manga

Character

Person

These are the implemented functions as of now. There are some further planned features.

Search Results

The official API does support this. However;

The response is in XML

It only shows the results of the first page

Jikan will change that by showing results for whatever page you select. And oh – it returns in JSON.

Is that it?

Mostly, yes. This API was developed to give developers very easy access to data which isn’t supported by the official API. And there you go.

Apart from that terrible reference, it is indeed time to tell you where the anime database stands right now. But first of all, I thought it would be best to clear up what this is all really about since my former related post was just really me typing at 200wpm while breathing heavily as the idea held a cast over me.

What is this sh*t?

There’s a bunch of anime databases out there apart from MyAnimeList, such as Anime Planet, AniDB and Anime News Network to name a few. Websites like these contain anime/manga/novel entries which detail the item. It can be compared to IMDB which does the same – except for movies. Sometimes, it’s useful to integrate a RESTful API which can allow developers to fetch these item details from your databases and add them to their own applications. Because the last thing we want to do is input all the anime/manga data into our own databases using traditional methods. Why not let the computer do it for us, amirite?

via https://codeplanet.io/principles-good-restful-api-design/

Now, back to MyAnimeList. MAL has an API but it’s very lacking. You can’t fetch anime, manga, people or even character details directly. Furthermore, the output is in XML rather than JSON. 😦

Okay, what now?

So what do we do? We create our own. Let’s say that now we have an API that can fetch any anime or manga data via their link through means of Scraping.

Let’s talk about Scraping. Scraping is a method that fetches the web page and goes through all the nicely written/s HTML code using an algorithm that extracts the information you need from that web page. When there’s no API, this is the only solution. That, or we use another service that provides an API, but I really wanted to see how far I could go with this project – so why not?
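To make that a bit more concrete, here is a minimal sketch of the idea in Python (Jikan itself is written in PHP, so this is not its actual code). The sample markup, the class names and the `FieldScraper` helper are all made up for illustration; real MAL pages are nowhere near this tidy.

```python
from html.parser import HTMLParser

# Hypothetical, simplified markup; real MAL pages are far messier.
SAMPLE_PAGE = """
<html><body>
  <h1 class="title">Steins;Gate</h1>
  <span class="episodes">24</span>
</body></html>
"""


class FieldScraper(HTMLParser):
    """Collects the text of every tag whose class attribute is in `wanted`."""

    def __init__(self, wanted):
        super().__init__()
        self.wanted = set(wanted)
        self.current = None  # class name of the wanted tag we are inside, if any
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class")
        if css_class in self.wanted:
            self.current = css_class

    def handle_data(self, data):
        if self.current is not None:
            text = self.fields.get(self.current, "") + data.strip()
            self.fields[self.current] = text

    def handle_endtag(self, tag):
        # Good enough for flat markup; nested tags would need a stack.
        self.current = None


def scrape(html, wanted):
    """Return a dict mapping each wanted class name to the text found for it."""
    parser = FieldScraper(wanted)
    parser.feed(html)
    return parser.fields


print(scrape(SAMPLE_PAGE, ["title", "episodes"]))
```

The page goes in as a string and a plain dictionary of fields comes out, which is the whole trick. In practice the string would come from an HTTP request instead of a constant.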

What’s left?

We now have code that scrapes the web page and returns juicy data that you can cache/save/add/whatever. This requires you to provide the algorithm a link to the page you want to be scraped, but there are hundreds of thousands of anime and manga out there. It would be ridiculous to leave that to human hands. This is where the Crawler comes in.

The Crawler

What a ‘Crawler’ generally does is start at some page and scan that page for other links. Those other links get saved and then it visits those links, and this recursively keeps on going and going and going.

Now as the crawler is doing its job, the scraper goes through the freshly populated cache of links and gets the data from them. This is basically how search engines index pages.
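The crawling loop described above can be sketched roughly like this in Python. The `FAKE_SITE` dictionary stands in for real pages (each key is a page, each value is the list of links found on it) and is purely hypothetical. The sketch also uses a queue instead of literal recursion, which visits the same pages without running into recursion limits.

```python
from collections import deque

# Hypothetical site: each page maps to the links found on it.
FAKE_SITE = {
    "/": ["/anime/1", "/about"],
    "/anime/1": ["/anime/5", "/"],
    "/anime/5": ["/anime/1"],
    "/about": [],
}


def crawl(start, get_links):
    """Visit every page reachable from `start`, each exactly once."""
    seen = {start}
    queue = deque([start])
    order = []
    while queue:
        page = queue.popleft()
        order.append(page)
        for link in get_links(page):
            if link not in seen:  # skip pages we have already queued
                seen.add(link)
                queue.append(link)
    return order


print(crawl("/", lambda page: FAKE_SITE.get(page, [])))
```

The `seen` set is what stops it from going in circles when pages link back to each other, which they always do.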

But we’re making a really specific crawler. What I’m looking for are links to anime entries within MAL, as I mentioned before. These follow this pattern: https://myanimelist.net/anime/{anime id}

The crawler looks for links with this pattern and saves them, and then we have the scraper go through them and we get an indexed database!
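Picking out links that follow that pattern could look something like the following Python sketch. The regular expression and the `anime_ids` helper are my own illustration, and allowing a trailing title slug after the ID is an assumption about how the links appear on the site.

```python
import re

# Matches https://myanimelist.net/anime/{anime id}, optionally followed by
# a slash and more path (assumed to be a title slug).
ANIME_LINK = re.compile(r"https://myanimelist\.net/anime/(\d+)(?:/|$)")


def anime_ids(links):
    """Return the unique anime IDs found in `links`, in ascending order."""
    ids = set()
    for link in links:
        match = ANIME_LINK.match(link)
        if match:
            ids.add(int(match.group(1)))
    return sorted(ids)


links = [
    "https://myanimelist.net/anime/1",
    "https://myanimelist.net/anime/1/Some_Title",
    "https://myanimelist.net/manga/2",
    "https://myanimelist.net/anime/season",
    "https://myanimelist.net/anime/20",
]
print(anime_ids(links))
```

Everything that isn’t an anime entry (manga links, the season page and so on) simply gets ignored, and duplicates collapse into a single ID.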

What’s new?

Due to busy college life and other projects, I’ve been unable to pay complete attention to finishing this. However, as summer approaches, I find myself once again with a lot of time on my hands.

Realizing that MyAnimeList was lacking a simple API to fetch anime or manga details, I decided to create my own. I teased a few screenshots at the end of the previous related post as well. I basically decided to create an unofficial API that lets you simply do what you can’t do from the official API.

Meet ‘Jikan’ – The Unofficial MyAnimeList API

This is the Scraper I’ve been talking about. It’s written in object-oriented PHP. So far it can fetch Anime, Manga and some Character details. It’s going to be a lot more, very soon.

Hell, I even got a domain for it: http://jikan.me, although there is nothing to be seen there at the moment. For now, I plan on hosting the API there once it’s finished so that others can utilize it with ease as well. Jikan returns data in JSON format with a simple, RESTful GET request.

It seems I’ve gotten quite sidetracked. Right now I have a solid algorithm to fetch the details required to make an Anime database. The next obvious step would be to make a robust crawler, right?

No.

That would double bandwidth and processing power. Each page will be required to be downloaded and scanned twice. Once for the crawler, once for the scraper. I do realize that I previously used the crawler method and got a list of quite a few anime with their details but it was not until a few days later I realized that MAL had a sitemap.

According to this and this we have two less time-consuming methods. The first one is a sitemap of anime listings for crawlers/search engines. The second one is a method to download a huge list of entries using wildcards in the search. Personally, I have a terrible internet speed and just want to confirm that this works by testing my API against the data it scrapes. The sitemap goes up to 33,000 anime IDs whereas the wildcard search results yield more than 107,000 anime IDs! I’ll go with the former, which consists of 30~ish % of the entries.
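For the sitemap route, the work boils down to reading an XML file and picking the IDs out of it. Here is a rough Python sketch, assuming the sitemap follows the standard sitemaps.org format; the sample entries are made up and the real MAL sitemap may be laid out differently. It also double-checks the 30~ish % figure.

```python
import re
import xml.etree.ElementTree as ET

# Namespace used by the standard sitemap format (sitemaps.org).
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical sample; the real sitemap is assumed to look similar.
SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://myanimelist.net/anime/1</loc></url>
  <url><loc>https://myanimelist.net/anime/5</loc></url>
</urlset>
"""


def sitemap_anime_ids(xml_text):
    """Return the anime IDs listed in a sitemap, in document order."""
    root = ET.fromstring(xml_text)
    ids = []
    for loc in root.findall("sm:url/sm:loc", SITEMAP_NS):
        match = re.search(r"/anime/(\d+)", loc.text or "")
        if match:
            ids.append(int(match.group(1)))
    return ids


def coverage(sitemap_count, wildcard_count):
    """Fraction of the wildcard search results that the sitemap covers."""
    return sitemap_count / wildcard_count


print(sitemap_anime_ids(SAMPLE_SITEMAP))
print(round(coverage(33_000, 107_000) * 100, 1))
```

One download of the sitemap gives the scraper its whole to-do list, so no page has to be fetched twice just to discover links.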

Last year, on the 3rd of April, I lay in excitement as I bragged about a design update to whatever audience I had for my portfolio that went from 💩 to something that I would call an achievement that day.

I went on about how content I was with it for the time being until about a week later the excitement was replaced with a pit of angst within myself as I pondered on why everything looked so wrong.

Self-learning design has always been about observing it. Letting it sink in then having the option to replicate it. You have the tools. You know how to use them. But in the end, your mind is a blank slate.

And with that said, let’s get into a self-analysis of how I’ve taken it a step further over the year. Below you’ll witness a murder and the subjugator.

Le Subjugator

Le dieded

Now that you know, let us carry on with what was truly wrong with the former design. First of all.

I don’t know what came into mind when I designed this but as far as I can recall, it looked good on paper. What I simply wanted to implement was a Call To Action button that would scroll the viewer down to my projects section.

The button looks like a drop down and that pop out makes no sense.

It was late 2016 when I caught up with simple, yet effective SVGs (in detail here). My conclusion was that abrupt ends towards the end of sections were simply too terrible, especially the way I was executing them. SVGs came along and filled in that gap.

With a much better looking CTA (but still not the best), I managed to complement it with a ‘layered’ effect of different shades. Playing around with geometry transitions has always been a favorite.

Unfortunately, I don’t have the files of my former design but those triangles slid in with a 45 degree rotation. The black and cyan-ish triangles you see there are simply squares that are hidden and partially shown on hover through that 45deg tilt. The black box tilted to the right whereas the bottom, cyan box tilted upwards. This gave it this sleek look.

The job of the CTA was to scroll the user down 100 pixels to the designated section…

It was not until later that I realized this was terrible and put some distance between them by bringing the ‘about’ section before the portfolio. Talk about proper hierarchy.

The cringe

What you see below that paragraph of cringe is a bunch of icons of the ‘skillsets’ I have. Before this, I had bars that represented this, similar to my 2015 design:

But the thing is, there’s no limits to these skills. Something new comes out every day and I was simply pulling out the experience percentage based on my then beliefs.

Instead, I’ve replaced it with something generic and descriptive. This goes below the about section.

The simple “Biography”

Speaking of the about section, here’s how it looked before.

Talk about a cringeful, long description with a font size big enough for the elders only.

I’ve changed this entirely. It’s now the first section, so it’s right below the header.

Pretty dank, aye?

Navigation, ahoy

Before we progress, there is one more thing we need to talk about and that’s the sticky navigation bar that follows you down.

Before I had this dull piece of stick

And now, there’s this

The portfolio’s new design is supposed to have more contrast. As you may have noticed (or not), the height of the sticky nav bar is much smaller now. It adds more breathing space to the page by tens of pixels.

There’s also a little border at the bottom to make it ‘pop-out’ with a much more elaborate shadow at the bottom than the former.

The Portfolio

Let’s talk about the portfolio section. Before it was boring cards with direct links to the download or view button and the cards themselves had some design issues. I don’t have a pre-existing screenshot but they looked like the ones now except the images got squeezed or stretched.

I’ve updated that part and divided the portfolio into 3 categories, Client, Designs & Apps.

A guess would be enough to realize that the client category is for client work and the design category is for web designs I make myself, be it free or premium. And Apps are app websites I’ve developed.

But that’s not the best part of the portfolio. You may see a familiar design on hover here, lacking the download and view buttons. That’s because it’s begging to be clicked. If it’s downloadable, a small download button appears as well. Now once you click on the card, it fetches the details for the item via AJAX and, using Bootstrap Modals, we get this beauty.

I can safely say that this is probably the first modal I’ve designed, hence some design issues here too. I need to rework the bottom part, not sure at the moment how though. But this is what I’ve got. It’s quick and simple. Bootstrap Modals are a bonus with User Experience. Click on the X or behind the modal and you’re back to the portfolio.

The Bottom

The contact section may look nearly identical to the former but rest assured it’s been revamped from a user experience perspective. Instead of the good old page reload for the submission process, it now utilizes the power of AJAX for an asynchronous update on the spot. It’s paired with Sweet Alerts and Google’s reCAPTCHA, which provides some mercy on the database.

There are 860 messages, of which 4 are actual messages and the rest are spam. My website is quite popular with the bots – heh.

Messages from Bots

Nevertheless, here’s how fast and simple it is now for anyone to drop in a message.

As for the footer, I’ve removed the bulky useless section it had before and replaced it with something simple.

You may have realized, I removed the Twitter Feed! Actually, I had plans for it. Right now I’m using free hosting from 000webhost. It disallows the use of REST APIs and that sucks. I actually have the whole Twitter API, cache, etc ready to roll out. All it’s lacking is the design. I thought I’d put it in the footer but that’s just meh. I thought I should promote my blog posts on my portfolio as well so I’ve been thinking of making an entire new section dedicated to a twitter feed (“recent rambling”) and blog posts (“recent posts”). I haven’t designed anything yet as they both use REST APIs and until I move to a proper limitless domain, I won’t be updating on this.

Enough of the design, let’s talk about the inside.

It’s what’s on the inside that counts.

I’ve re-coded everything in PHP, still not following the MVC structure but rather my own structure, which is truly odd to explain.

I’ve moved from using CSS to LESS. LESS is basically better CSS syntax. Next up, I’ve started using Bootstrap as the front end framework. As much as I promised to use only made-from-scratch stuff, I’ve really been slowing myself down. Bootstrap has built in modals, grids, etc which really put off workload. Both of these combined proved a faster and easier workflow. I’ve really cut in half the time it takes to code a design.

What now?

I’m still not content with how it looks. I’ve yet to implement the section that consists of my blog and twitter posts. Also, the website lacks responsiveness (not adapted for mobile or tablets). I’m too lazy to add it now but I’ll probably ninja add it later on.

One more thing!

Branding. So far I’ve not used any logos to represent myself and really needed a favicon (that tiny icon you see on your tab before the web page’s title). So I followed my life motto, “Why not?”, and thus utilized my expert GIMP skills (2poor4photoshop) and came up with the following.

As you can see, it looks terrible at the edges, but that’s not going to show in a small logo or favicon and I’m too lazy to perfect the small details at the moment. Maybe later on?

This concludes the berating of my former portfolio and what I did to upgrade it.

So, I’ve redesigned the portfolio. Sort of. Even re-coded the backend so it’s OOP that makes actual sense. That’s the gist of it.

SVG

It’s been used around a lot since the past year and I thought “why not?”. I made some really simple SVGs that are right-angled triangles to give the edges of containers a padded and – er – better look? THEY LOOK GREAT, and that’s what I believe matters? Okay. In addition, these are vector graphics and are ludicrously small in size. Going to use them more in further projects after I get a good grip on them.

BOOTSTRAP

The only reason I’ve started using Bootstrap is because of its preset configuration, glyphs and the responsive grids. Oh, the responsive grids – never have I ever made something without spending much time on the frames.

I pat myself on the back for making it look better but I still believe it’s got a way to go before it reaches perfection. Upgrading the current design rather than making a completely new one was really the proper choice.

WHAT’S NEXT?

I’m planning a twitter feed and a cool little vertical ‘timeline’ under my bio which would show which programming language/feat I achieved in which year.

I also think that the SVG triangles are a little too large? Not sure, but I’m going to play around with that.

There have been delays but it’s here. The Alpha version of the CS2D log data extractor, Project.Extract Cloud, is up and running. There’s some stuff left to do. I’ll explain this in a second.

Other than that you can only extract 1 file. I might as well set this as the limit. I’m gratified to be hosted for free by BroHosting as a test of their hosting services and so far there are absolutely no critical problems.

WHAT YOU SHOULD CHECK OUT!

That would be the server statistics functionality. The core of the application lies within there. Feel free to drop in whatever log of your choice and get as much information out of it as possible!

TODO!

Text Searching

The text searching page right now is the bare minimum; it’s simply 5% done. It’ll look more polished and organized, like the ‘server statistics’ page.

User Database

The user database will be an offline-only feature of PE4. It’ll automatically store player information as a database for you to easily access.

Server Statistics Polishing

As complete as it looks, it’s still a bit far from done. First off, the map graph you see is a complete dummy. It’s not implemented at all. Secondly, there is some design polishing I need to do. Apart from that I want to see if I can fit more data and graphs in there.

Usage Statistics

You’ve probably noticed a blank space in the black bar at the top after you click it. What’s meant to be stored there is a graph of your usage statistics of the browser app. The core functionality of this is complete but I’m planning to add the graphs and such at the end.

PRIVACY

Some of you might be wondering about the log files that you’re uploading to the server. I’ll let you know beforehand that these log files are stored. The reason for this is that they’re cached in case you reload the page. A JSON version of the extracted contents is stored as well.

When I release beta, what I said will still be applicable to your offline version of Project.Extract but the cloud version won’t store anything. Nothing will be cached.

So, I’m back with another blog. I do realise that the previous one (bootyphpandi.wordpress.com) was lacking a decent name and so I took it upon myself (again) to bring it to a professional state. I had plans of making my own CMS but again I realised that I’d have to hit up AngularJS and some more alpha type stuff to make it look like a decent CMS. Plus, due to the lack of time, I’ll be using this as my official blog.

Shoutout to Tonal theme as I really love this minimalistic freebie.

Some Updates

Portfolio Polishing

I updated my own portfolio (irfandahir.com). The design was left unfinished so I polished it up a bit after receiving some insight and critique from forum boards. I still feel it’s lacking so I’m devising plans to make it look nicer.

Introducing Omilos

I’ve made another freebie, Omilos. I thought “Omilos” was Greek for “something big” but my Greek buddy corrected me once more. Nevertheless, the design is up to the level of being used. IMO it is lacking some design fundamentals and has some flaws but it will get your job done as it’s coded as cleanly as possible. If you hire a designer or have some coding skills, you’ll be able to adjust it to your needs.

Project.Extract 4

If you’re a CS2D player then you might know what this is; if not, then here’s a brief explanation. Project.Extract (including legacy versions 1, 2 & 3) has been a downloadable app which runs through your browser with a dependency on Apache & PHP (WAMP, LAMP). CS2D generates a fair amount of log files and so I created a PHP Library which would extract useful amounts of information from these logs. And Project.Extract is the visual version of that. The Legacy versions 1-3 only extracted user information and had text based searching.

So a year later, after leveling up multiple times in PHP, I realised I could extract so much more. I’ve developed a PHP Library, Log Miner, which acts as the core of Project.Extract 4. Both the PHP Library and PE4 are in the works. The difference between the legacy versions and this is that this has the capability to extract A LOT more from your logs. Every single detail. And the awesome part? It’s both a web-based app if you don’t know how to set it up and a downloadable one, which removes the limits. I’ll talk more about it once I’m ready to deploy it.