Author: wizardofbots

Tonight I was watching Mr. Robot, season 2, episode 4, and it reminded me of the good old days when IRC was above any other social network. People met there in tons of channels to chat and have discussions. There were also plenty of groups talking about all kinds of stuff, and the best crews were the ones with coders.

So, what do I like about IRC? There were plenty of cool things back in 1995. There were the amazing eggdrops, bots that you programmed to respond to different messages. They also had TCLs, the Tcl scripts that worked as addons/plugins you could adapt to your bot; many of them were cool group games. There were also PsyBNCs, bouncers that kept you always online from your shell.

Hell yeah, this one deserves a place right next to NightmareJS, even though it doesn't use Electron to simulate a browser. Instead it is a headless parser built on native libxml C bindings, and it does a great job.

To start, make sure you have already installed Node.js along with npm. If you get an error for the libxmljs library, make sure that you install this:

npm install node-gyp

npm install libxmljs

npm install osmosis

Then it should work properly if you create a file.js and run the example script.

Here you have a bunch of examples to copy and paste to test it. If you want me to explain it further, please send your questions or your requests for video tutorials. I might open a premium spot for it, so you better fucking buy me a cup of oil so I can keep my commitment to you.

TMUX is tha maaaaan! Sure as death you are going to love this! Remember when you ran a scan on a remote VPS? Every time you left it working, the connection died with a broken pipe, so you lost the session and never got the results you wanted because the code never reached the end.

But with TMUX you can leave many SSH “sessions/terminals” running many processes. You can leave scrapers, scanners, crawlers, and even an app running on nginx/apache. It's like using all the capacity of your VPS and not wasting a single fucking penny!

It's something like this:

So let’s get physical now. To use tmux you of course need a Linux box from any provider you want; 512 MB of RAM for $5 USD on DigitalOcean is enough. Once you have it, connect via SSH from the terminal like: ssh [email protected]

Once you are in, make sure you have tmux installed:

sudo apt-get install tmux

Then, to open it, just type tmux in the terminal.

Basic navigation in tmux works like this by default:

Ctrl-b c Create new window

Ctrl-b d Detach current client

Ctrl-b l Move to previously selected window

Ctrl-b n Move to the next window

Ctrl-b p Move to the previous window

Ctrl-b & Kill the current window

Ctrl-b , Rename the current window

Ctrl-b % Split the current window into two panes

Ctrl-b q Show pane numbers (used to switch between panes)

Ctrl-b o Switch to the next pane

Ctrl-b ? List all keybindings

The trick here is that you can have as many windows as you want over one SSH connection by pressing Ctrl-b c and running another process in each, then navigate between them with Ctrl-b n and Ctrl-b p (for next and previous). Or you can split your window with Ctrl-b % and watch everything in panes.

IMPORTANT: If you want to exit TMUX and leave the session working with the scripts you ran, press Ctrl-b d, which detaches it.

Afterwards, when you come back from your procrastination, you can type:

tmux attach

This restores the tmux session that you left working with its processes.

When you come back to your session, you will see that everything has either finished or is still running and extracting more data.

So now you know how to get the most out of what you pay for the VPS. Hope you like it, and leave the love dudes ;).
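If you would rather start a long job directly in a detached session instead of detaching by hand, the same workflow can be scripted. This is only a minimal sketch: the session name "scraper" and the script "scraper.py" are made-up placeholders, and by default the helper only prints the tmux commands rather than running them.

```python
import shlex
import subprocess


def tmux_new_detached(session, command):
    """Build the argv that starts `command` in a new, already detached session."""
    return ["tmux", "new-session", "-d", "-s", session, command]


def tmux_attach(session):
    """Build the argv that re-attaches to a named session."""
    return ["tmux", "attach-session", "-t", session]


def run(argv, dry_run=True):
    """Print the command; only execute it when dry_run is False."""
    print(shlex.join(argv))
    if not dry_run:
        subprocess.run(argv, check=True)


if __name__ == "__main__":
    # "scraper" and "scraper.py" are hypothetical names for illustration.
    run(tmux_new_detached("scraper", "python3 scraper.py"))
    run(tmux_attach("scraper"))
```

Naming the session means you can come back to that exact job later, even when several sessions are running on the same VPS.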

Damn, this is so fucking awesome: automating your browser using Electron with NightmareJS. Electron is backed by GitHub, and its competitor is NW.js, backed by Intel, so we have a good platform to simulate a browser and interact with it. Older versions of Nightmare ran on PhantomJS; since version 2 it runs on Electron.

So if you want to start, just make sure you have installed everything for the Node.js dev environment, which includes:

Sometimes you just want to make your life easier by deploying a server in seconds with all the dependencies you need to run many languages, or maybe an easy LAMP config.

I made this whole compilation, guys, but right now I don't feel like finishing it, so I decided to add a GitHub repo that you can just clone and run very quickly. I will be updating this post frequently. Just make sure to comment here to report a bug or something that needs to be modified, or do it on the repo: https://github.com/wizardofbots/wizardinstall

So I decided to call it wizardinstall 😉 We will be adding the repos you need in order to have everything ready to start on a $5 DigitalOcean server, so you can have a Linux shell on your smartphone with a realtime connection and execute your scripts!

So check out the code. You will learn how to use it and understand the logic; I will add comments on each line where necessary so you guys understand:

wizardinstall

Shell

#!/bin/bash
# ******************************************
# Program: Dev mode install
# Developer: Wizard of Bots
# Site: http://wizardofbots.com/network
# Github: https://github.com/wizardofbots/wizardinstall
# Date: 16-07-2016
# ******************************************
# The line below checks whether the output of the lsb_release -is command
# is Ubuntu or Debian, because then we can use apt-get instead of yum.
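The distro check described in that last comment can be sketched in Python as well. This is not the wizardinstall script itself, just an illustration of the idea: the function names are hypothetical, and falling back to yum for anything that is not Ubuntu or Debian is an assumption taken from the comment.

```python
import subprocess

# Distros where apt-get is the package manager, per the script's comment.
APT_DISTROS = {"Ubuntu", "Debian"}


def detect_distro():
    """Return the output of `lsb_release -is`, or "" if it cannot be run."""
    try:
        result = subprocess.run(
            ["lsb_release", "-is"], capture_output=True, text=True, check=True
        )
    except (OSError, subprocess.CalledProcessError):
        return ""
    return result.stdout.strip()


def package_manager(distro_id):
    """Pick apt-get for Ubuntu/Debian and yum for everything else (assumed)."""
    return "apt-get" if distro_id.strip() in APT_DISTROS else "yum"


if __name__ == "__main__":
    print(package_manager(detect_distro()))
```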

I wrote an article about this because it took me about 2 hours to solve. It seems like all the sources are outdated in their methods. Many people were talking about the html2text module, which is now deprecated, and many others recommended nltk. The final result is that BeautifulSoup now does a better job than both of them. Buuuut…..

All the resources pointed to a function called get_text(), which did not work for me. get_text() is the BeautifulSoup 4 (bs4) name; with the older BeautifulSoup 3 module imported here, the method is getText(), in camel case.

Why would I need to do this? Well, especially when some networks encode their pages with strange HTML characters that make your scraping a nightmare. Here is the chunk of code, explained.

Python

from BeautifulSoup import BeautifulSoup as Soup  # BeautifulSoup 3, Python 2
import urllib2

url = 'http://google.com'
html = urllib2.urlopen(url).read()  # make the request to the URL
soup = Soup(html)  # run Soup on the response we just read
for script in soup(["script", "style"]):  # find the <script> and <style> tags
    script.extract()  # strip them out of the tree
text = soup.getText()  # this is the method that gave me like 40 min of problems
text = text.encode('utf-8')  # encode once so the output is a UTF-8 byte string
# raw = nltk.clean_html(document)  # the old nltk way, not needed anymore
print(text)  # already encoded above, so don't encode it a second time

So now you know how to get the text from a response, and it will be easy to pull data out of it using regular expressions 🙂

Simple HTML DOM is a PHP library that helps you parse the DOM and find things inside it very fast, instead of using plain PHP, which would take you hours writing your own libraries. There is a similar one called PHP Selector.

What we want to do in this case is grab the search results from Bing.
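For anyone on Python 3 without BeautifulSoup installed, here is a rough sketch of the same idea using only the standard library: drop the script and style content, keep the visible text, then run a regular expression over it. It is not the BeautifulSoup method from the post; the sample HTML and the price pattern are invented for the example.

```python
import re
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect text nodes, skipping anything inside <script> or <style>."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside script/style blocks.
        if self._skip_depth == 0 and data.strip():
            self._chunks.append(data.strip())

    def text(self):
        return " ".join(self._chunks)


def html_to_text(html):
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


# Made-up sample page, only for illustration.
sample = (
    "<html><head><style>p{color:red}</style>"
    "<script>var x = 1;</script></head>"
    "<body><h1>Prices</h1><p>Widget: $19.99</p>"
    "<p>Gadget &amp; case: $5.00</p></body></html>"
)

text = html_to_text(sample)
prices = re.findall(r"\$\d+\.\d{2}", text)  # hypothetical pattern for prices
print(text)
print(prices)
```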

PHP

<?php
// Pass the keyword as an argument when you run the script: php this.php "your keyword"
$keyword = $argv[1];
require_once('simple_html_dom.php');
// urlencode() keeps spaces and special characters from breaking the query string
$bing = 'http://www.bing.com/search?q=' . urlencode($keyword) . '&count=50';
// We do it with Bing, but it is almost the same with the other search engines.
echo "#####################################\n";
echo "###       SEARCHING IN BING      ####\n";
echo "#####################################\n";
$html = file_get_html($bing);
$linkObjs = $html->find('li h2 a');
foreach ($linkObjs as $linkObj) {
    $title = trim($linkObj->plaintext);
    $link = trim($linkObj->href);
    // if it is not a direct link but a URL reference is found inside it, then extract
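To see what the 'li h2 a' selector is doing without the PHP library, here is a small standard-library Python sketch of the same two steps: build the search URL, then collect the title and href of every link that sits inside an h2 inside an li. The sample markup is invented for the test and real Bing pages may be structured differently; the class and function names are hypothetical too.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode

# Tags that never get a closing tag, so they must not go on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


def bing_url(keyword, count=50):
    """Build the search URL with the keyword safely URL-encoded."""
    return "http://www.bing.com/search?" + urlencode({"q": keyword, "count": count})


class ResultLinks(HTMLParser):
    """Collect (title, href) for every <a> nested as li -> h2 -> a."""

    def __init__(self):
        super().__init__()
        self._stack = []    # currently open tags
        self._href = None   # href of the link being captured, if any
        self._title = []
        self.results = []

    def _in_result_heading(self):
        # True when an <h2> is open somewhere inside an open <li>.
        if "li" not in self._stack:
            return False
        return "h2" in self._stack[self._stack.index("li"):]

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if tag == "a" and self._href is None and self._in_result_heading():
            self._href = dict(attrs).get("href") or ""
            self._title = []
        self._stack.append(tag)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            title = "".join(self._title).strip()
            self.results.append((title, self._href.strip()))
            self._href = None
        if tag in self._stack:
            # Pop back to the matching open tag, tolerating unclosed tags.
            while self._stack.pop() != tag:
                pass

    def handle_data(self, data):
        if self._href is not None:
            self._title.append(data)


# Invented sample markup, only for illustration.
sample = (
    '<ol><li><h2><a href="http://example.com/a"> First <strong>result</strong></a></h2>'
    '<p>snippet <a href="http://example.com/ignored">more</a></p></li>'
    '<li><h2><a href="http://example.com/b">Second</a></h2></li></ol>'
)

parser = ResultLinks()
parser.feed(sample)
parser.close()
for title, link in parser.results:
    print(title, link)
```

The link inside the snippet paragraph is skipped because it has no h2 ancestor, which is the same filtering the selector gives you in the PHP version.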

Well, this is a post about what we will be covering on this network. Mainly, we will teach you how to use various techniques to automate the web to your needs.

We will also compare different technologies in different languages, along with the libraries/modules that will make your life easier.

This community will be built on the many works that will come thanks to the landing page and the reputation it will raise. The private/custom bot tutorials will be private, and you guys will need to pay to see how we did it and with which technology. Then we can discuss how to improve those techniques and build a very big community supporting each other.