How To Ask Questions The Smart Way

Just doing some light reading today and ran across “How To Ask Questions The Smart Way” by Eric Steven Raymond.  I read this years ago, when an instructor in college had us read through the whole thing.  Of course I forgot all about it, and now it’s a new gem I’ll reference now and again.  Pay attention to the “When You Ask” section; I think it lays out each point pretty well.

How To Ask Questions The Smart Way

 

Citrix XenServer 6.1 – XenServer Tools out of date (version 6.0 installed)

Recently we upgraded our XenServer resource pool to 6.1 because the new feature allowing you to do live migration of VDIs sounds unreal.  I still get amazed that this stuff works, well, most of the time.  Now we are having issues with XenTools being reported as out of date in XenCenter.  I see the message:

XenServer Tools out of date (version 6.0 installed)

Great, I’ll just pop the XenTools CD in, uninstall the old version, and install the new version.  Nope, that does not work.  Instead it makes the issues worse: you try to reboot the system, but XenServer never figures out the server is shut down, and you end up having to do vm-reset-powerstate or even destroy the domain.  Either way this sucks.

Citrix just updated its XenServer blog yesterday, acknowledging that the issues with XenTools are real and that they will try to fix their processes.  And they are right, this sort of thing really does hurt your confidence in the system.  And we all know virtualization is built on confidence.

Now here is the real place to get all the info on what needs to happen.  It’s a two-part update: first you need to apply hotfix XS61E009 and then XS61E010.  Then you have the fun of updating XenTools.  That is a lot of work nonetheless when you have 10s, 100s, or 1000s of VMs.

http://support.citrix.com/article/CTX135099
http://support.citrix.com/article/CTX136252
http://support.citrix.com/article/CTX136253

*UPDATE*

If you run into a blue screen on Windows while installing the new XenTools from XS61E010 (I know, more issues??), try to get into safe mode, remove XenTools from the machine, and reinstall; that may work.  Otherwise you can try looking up your VM’s UUID and changing the device ID.  This may work if your VM does not get the correct device ID set during the new XenTools install.  Use this command:

xe vm-param-set uuid=<vm_uuid> platform:device_id=0001

or

xe vm-param-set uuid=<vm_uuid> platform:device_id=0002
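
If you only know the VM by its name, something like the following may save you the lookup.  This is a rough sketch of my own, not something from the Citrix articles: the function names and the VM name in the usage comment are made up, and it assumes the xe CLI is on your PATH (i.e. you are on a XenServer host).

```shell
#!/bin/sh
# Sketch: set platform:device_id on a VM chosen by its name label.
# Assumes `xe` is available; function names are hypothetical helpers.

# Print the UUID of the VM with the given name label.
# With --minimal, several matches come back as one comma-separated string.
vm_uuid_by_name() {
    xe vm-list name-label="$1" params=uuid --minimal
}

# Set platform:device_id (0001 or 0002) on the named VM.
set_device_id() {
    name=$1
    device_id=$2
    uuid=$(vm_uuid_by_name "$name")
    if [ -z "$uuid" ]; then
        echo "no VM named '$name' found" >&2
        return 1
    fi
    xe vm-param-set uuid="$uuid" platform:device_id="$device_id"
}

# Usage (hypothetical VM name):
#   set_device_id "win2008-web01" 0002
```

If two VMs share a name label the lookup returns more than one UUID, so check the name is unique before relying on this.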

Daily Show Lays Out the Michigan Bridge to Canada

I love the Daily Show and normally I don’t repost videos from them, but this is just gold.  I can’t believe how easily the rich can control TV viewers.  Great job, Daily Show, you hit the nail on the head with this one.

http://www.thedailyshow.com/watch/wed-january-9-2013/bridge-to-canada?xrs=share_copy

Citrix XenServer System Recovery Guide

I found this white paper from the early days of XenServer.  I think it’s worth a once over as most of the information and logic remain valid.  The flow chart of recovery steps makes it look pretty simple and can maybe help someone out of a jam.  This is definitely something you want in your disaster recovery plan!

XenServer System Recovery Guide

Original source: http://support.citrix.com/servlet/KbServlet/download/17140-102-671536/XenServer%20System%20Recovery%20Guide.pdf

SMSTools GSM Equipment and Network Error Codes

At work I’ve got a Nagios server set up to send text messages to my phone on critical notifications.  Lately something has been happening that causes SMS notifications to fail.

In smsd.log I find that I get the following error message:

+CME ERROR: 515 (device busy)

As root I’m able to reset the modem by doing this:
/bin/echo "AT+CFUN=1" > /dev/ttyUSB0
/etc/init.d/smstools restart
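
If this keeps happening, the check and the reset could be wrapped up for cron.  This is only a sketch under a few assumptions: the function names are mine, the log path in the usage comment is a guess (use wherever your smsd.log actually lives), and it only looks at the last 20 lines of the log.

```shell
#!/bin/sh
# Sketch: reset the modem when smsd.log shows the "device busy" error.

# Succeed if the last 20 lines of the given log mention CME ERROR 515.
modem_busy() {
    tail -n 20 "$1" | grep -q 'CME ERROR: 515'
}

# Same two commands as above, with the modem device as an argument (needs root).
reset_modem() {
    /bin/echo "AT+CFUN=1" > "$1"
    /etc/init.d/smstools restart
}

# Usage from root's crontab (paths are assumptions):
#   modem_busy /var/log/smsd.log && reset_modem /dev/ttyUSB0
```

Keep in mind the error stays in the last 20 lines for a while, so a frequent cron job may reset the modem more than once for the same failure.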

GSM Equipment related codes

Error Description
CME ERROR: 0 Phone failure
CME ERROR: 1 No connection to phone
CME ERROR: 2 Phone adapter link reserved
CME ERROR: 3 Operation not allowed
CME ERROR: 4 Operation not supported
CME ERROR: 5 PH_SIM PIN required
CME ERROR: 6 PH_FSIM PIN required
CME ERROR: 7 PH_FSIM PUK required
CME ERROR: 10 SIM not inserted
CME ERROR: 11 SIM PIN required
CME ERROR: 12 SIM PUK required
CME ERROR: 13 SIM failure
CME ERROR: 14 SIM busy
CME ERROR: 15 SIM wrong
CME ERROR: 16 Incorrect password
CME ERROR: 17 SIM PIN2 required
CME ERROR: 18 SIM PUK2 required
CME ERROR: 20 Memory full
CME ERROR: 21 Invalid index
CME ERROR: 22 Not found
CME ERROR: 23 Memory failure
CME ERROR: 24 Text string too long
CME ERROR: 25 Invalid characters in text string
CME ERROR: 26 Dial string too long
CME ERROR: 27 Invalid characters in dial string
CME ERROR: 30 No network service
CME ERROR: 31 Network timeout
CME ERROR: 32 Network not allowed, emergency calls only
CME ERROR: 40 Network personalization PIN required
CME ERROR: 41 Network personalization PUK required
CME ERROR: 42 Network subset personalization PIN required
CME ERROR: 43 Network subset personalization PUK required
CME ERROR: 44 Service provider personalization PIN required
CME ERROR: 45 Service provider personalization PUK required
CME ERROR: 46 Corporate personalization PIN required
CME ERROR: 47 Corporate personalization PUK required
CME ERROR: 48 PH-SIM PUK required
CME ERROR: 100 Unknown error
CME ERROR: 103 Illegal MS
CME ERROR: 106 Illegal ME
CME ERROR: 107 GPRS services not allowed
CME ERROR: 111 PLMN not allowed
CME ERROR: 112 Location area not allowed
CME ERROR: 113 Roaming not allowed in this location area
CME ERROR: 126 Operation temporary not allowed
CME ERROR: 132 Service operation not supported
CME ERROR: 133 Requested service option not subscribed
CME ERROR: 134 Service option temporary out of order
CME ERROR: 148 Unspecified GPRS error
CME ERROR: 149 PDP authentication failure
CME ERROR: 150 Invalid mobile class
CME ERROR: 256 Operation temporarily not allowed
CME ERROR: 257 Call barred
CME ERROR: 258 Phone is busy
CME ERROR: 259 User abort
CME ERROR: 260 Invalid dial string
CME ERROR: 261 SS not executed
CME ERROR: 262 SIM Blocked
CME ERROR: 263 Invalid block
CME ERROR: 772 SIM powered down

GSM Network related codes:

Error Description
CMS ERROR: 1 Unassigned number
CMS ERROR: 8 Operator determined barring
CMS ERROR: 10 Call barred
CMS ERROR: 21 Short message transfer rejected
CMS ERROR: 27 Destination out of service
CMS ERROR: 28 Unidentified subscriber
CMS ERROR: 29 Facility rejected
CMS ERROR: 30 Unknown subscriber
CMS ERROR: 38 Network out of order
CMS ERROR: 41 Temporary failure
CMS ERROR: 42 Congestion
CMS ERROR: 47 Resources unavailable
CMS ERROR: 50 Requested facility not subscribed
CMS ERROR: 69 Requested facility not implemented
CMS ERROR: 81 Invalid short message transfer reference value
CMS ERROR: 95 Invalid message unspecified
CMS ERROR: 96 Invalid mandatory information
CMS ERROR: 97 Message type non existent or not implemented
CMS ERROR: 98 Message not compatible with short message protocol
CMS ERROR: 99 Information element non-existent or not implemented
CMS ERROR: 111 Protocol error, unspecified
CMS ERROR: 127 Interworking, unspecified
CMS ERROR: 128 Telematic interworking not supported
CMS ERROR: 129 Short message type 0 not supported
CMS ERROR: 130 Cannot replace short message
CMS ERROR: 143 Unspecified TP-PID error
CMS ERROR: 144 Data code scheme not supported
CMS ERROR: 145 Message class not supported
CMS ERROR: 159 Unspecified TP-DCS error
CMS ERROR: 160 Command cannot be actioned
CMS ERROR: 161 Command unsupported
CMS ERROR: 175 Unspecified TP-Command error
CMS ERROR: 176 TPDU not supported
CMS ERROR: 192 SC busy
CMS ERROR: 193 No SC subscription
CMS ERROR: 194 SC System failure
CMS ERROR: 195 Invalid SME address
CMS ERROR: 196 Destination SME barred
CMS ERROR: 197 SM Rejected-Duplicate SM
CMS ERROR: 198 TP-VPF not supported
CMS ERROR: 199 TP-VP not supported
CMS ERROR: 208 SIM SMS Storage full
CMS ERROR: 209 No SMS Storage capability in SIM
CMS ERROR: 210 Error in MS
CMS ERROR: 211 Memory capacity exceeded
CMS ERROR: 212 SIM application toolkit busy
CMS ERROR: 213 SIM data download error
CMS ERROR: 255 Unspecified error cause
CMS ERROR: 300 ME Failure
CMS ERROR: 301 SMS service of ME reserved
CMS ERROR: 302 Operation not allowed
CMS ERROR: 303 Operation not supported
CMS ERROR: 304 Invalid PDU mode parameter
CMS ERROR: 305 Invalid Text mode parameter
CMS ERROR: 310 SIM not inserted
CMS ERROR: 311 SIM PIN required
CMS ERROR: 312 PH-SIM PIN required
CMS ERROR: 313 SIM failure
CMS ERROR: 314 SIM busy
CMS ERROR: 315 SIM wrong
CMS ERROR: 316 SIM PUK required
CMS ERROR: 317 SIM PIN2 required
CMS ERROR: 318 SIM PUK2 required
CMS ERROR: 320 Memory failure
CMS ERROR: 321 Invalid memory index
CMS ERROR: 322 Memory full
CMS ERROR: 330 SMSC address unknown
CMS ERROR: 331 No network service
CMS ERROR: 332 Network timeout
CMS ERROR: 340 No +CNMA expected
CMS ERROR: 500 Unknown error
CMS ERROR: 512 User abort
CMS ERROR: 513 Unable to store
CMS ERROR: 514 Invalid Status
CMS ERROR: 515 Device busy or Invalid Character in string
CMS ERROR: 516 Invalid length
CMS ERROR: 517 Invalid character in PDU
CMS ERROR: 518 Invalid parameter
CMS ERROR: 519 Invalid length or character
CMS ERROR: 520 Invalid character in text
CMS ERROR: 521 Timer expired
CMS ERROR: 522 Operation temporary not allowed
CMS ERROR: 532 SIM not ready
CMS ERROR: 534 Cell Broadcast error unknown
CMS ERROR: 535 Protocol stack busy
CMS ERROR: 538 Invalid parameter
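
To save scrolling through these tables every time, a small lookup helper can translate a log line into its description.  This is just a sketch: the function name is made up, and only a handful of codes from the tables above are included, so extend the case statement with whichever ones you actually see.

```shell
#!/bin/sh
# Sketch: describe a "+CME ERROR: n" or "+CMS ERROR: n" line from smsd.log.
# Only a few codes from the tables above are listed here.
describe_gsm_error() {
    # Reduce e.g. "+CMS ERROR: 515 (device busy)" to the key "CMS:515".
    key=$(printf '%s\n' "$1" |
        sed -n 's/.*\(CM[ES]\) ERROR: \([0-9][0-9]*\).*/\1:\2/p')
    case "$key" in
        CME:10)  echo "SIM not inserted" ;;
        CME:11)  echo "SIM PIN required" ;;
        CME:30)  echo "No network service" ;;
        CMS:38)  echo "Network out of order" ;;
        CMS:322) echo "Memory full" ;;
        CMS:515) echo "Device busy or Invalid Character in string" ;;
        "")      echo "not a CME/CMS error line" ;;
        *)       echo "code not in this short list ($key)" ;;
    esac
}

# Usage:
#   describe_gsm_error "+CMS ERROR: 322"
```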

Citrix XenServer Error: VDI is not Available

Error: Starting VM ” – The VDI is not available

So you’re now trying to boot a VM in XenServer but you are getting the error “VDI is not Available”. This means the VM crashed, the Xen host crashed, or something bad just happened. Either way you need your server back.

  1. Find the UUID of the VDI in question.
    xe vdi-list
  2. Note exactly which UUID maps to which drive on your server.  This is going to remove the VDI from the VM so we can reattach it correctly.  Drive order does matter: you don’t want to switch an OS VDI with a data VDI.
    xe vdi-forget uuid=<VDI UUID we found in step 1>
  3. Open XenCenter and navigate to the SR with your VDI.  Hit Rescan
  4. Now go to your VM with issues and attach the VDI via the Storage tab
  5. Boot your VM
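
If you would rather stay on the command line for the forget and rescan, steps 1 to 3 could be sketched like this.  This is my own helper, not an official procedure: the function name is hypothetical, and it assumes the xe CLI is available and that you already wrote down which VDI belongs to which drive.  Reattaching the VDI is still done in XenCenter as in step 4.

```shell
#!/bin/sh
# Sketch: forget a VDI and rescan the SR it lives on.
forget_and_rescan() {
    vdi_uuid=$1
    # Look up the SR first; once the VDI is forgotten its record is gone.
    sr_uuid=$(xe vdi-param-get uuid="$vdi_uuid" param-name=sr-uuid) || return 1
    xe vdi-forget uuid="$vdi_uuid" || return 1
    # Equivalent of hitting Rescan on the SR in XenCenter.
    xe sr-scan uuid="$sr_uuid"
}

# Usage:
#   forget_and_rescan <VDI UUID we found in step 1>
```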

Install Dell Openmanage on Citrix XenServer for Nagios checks

Like any good sysadmin, you want to know if anything is happening to your Dell hardware at any given moment.  Here is what I did to get Dell OpenManage installed in Citrix XenServer 5.6, 6.0, and 6.1.  Once OpenManage is installed and working, you can then have Nagios ssh into the XenServer host and run a check (this may be covered in another post).

  1. I now send you on a quest.  Head to the Dell website and start searching for the software.  Get something named “Dell OpenManage Server Administrator Managed Node (Distribution Specific)”, also called “OpenManage Supplemental Pack” or “OpenManage Server Administrator Managed Node”, with a file name like “OM-SrvAdmin-Dell-Web-LX-7.1.0-5304.XenServer60_A00.iso”.
  2. Transfer the iso to your xenserver host via scp.
  3. mount -o loop <openmanage-supplemental-pack-filename>.iso /mnt
  4. cd /mnt
  5. ./install.sh
  6. /etc/init.d/dataeng start
  7. Log out and back in, and this command should work:
    omreport storage pdisk controller=0
  8. /usr/sbin/useradd nagios
  9. passwd nagios
  10. cd /home/nagios
  11. mkdir .ssh
  12. Now we need to generate or install a ssh key for Nagios to login without a password.  Here is how you would generate one:
    ssh-keygen -t dsa -b 1024 -f .ssh/id_dsa
    cat .ssh/id_dsa.pub >> .ssh/authorized_keys
  13. chown -R nagios:nagios .ssh
  14. chmod 750 .ssh
  15. chmod 640 .ssh/*
  16. mkdir bin
  17. chown -R nagios:nagios bin
  18. chmod 750 bin
  19. Get the Nagios check script; this will be executed by Nagios when it logs in via ssh
    wget http://folk.uio.no/trondham/software/files/check_openmanage-3.7.3.tar.gz
  20. tar -xzvf check_openmanage-3.7.3.tar.gz
  21. cp check_openmanage-3.7.3/check_openmanage bin/
  22. If you are running Xenserver 6 or higher, you will need to run this command
    chmod o+rx /
  23. Log into your Nagios server
  24. Copy the ssh key pair (id_dsa and id_dsa.pub) to the Nagios server, into the nagios user’s ~/.ssh
  25. Test logging in without a password
  26. Set up the Nagios checks (I plan on posting this some day)
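
For the password-less test in step 25, and as a starting point for the eventual Nagios check, a wrapper along these lines might do.  It is a sketch only: the function name and the host name in the usage comment are made up, and it assumes the key and the bin/check_openmanage path from the steps above.

```shell
#!/bin/sh
# Sketch: run check_openmanage on a XenServer host over ssh, as Nagios would.
check_openmanage_ssh() {
    host=$1
    # BatchMode makes ssh fail instead of prompting if key auth is broken.
    ssh -o BatchMode=yes -i "$HOME/.ssh/id_dsa" "nagios@$host" bin/check_openmanage
}

# Usage, run as the nagios user on the Nagios server (hypothetical host name):
#   check_openmanage_ssh xenhost01
```

Because ssh passes the remote command's exit status back, Nagios should see the plugin's own return code.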


Small Changes to Increase Security on Ubuntu Servers

Here are some things I’ve done to help increase security on my Ubuntu boxes.  When securing a Linux system, the goal is to prevent, detect, and react.  These small changes will help with that goal.

Be careful with these changes, as you can lock yourself out of the server.
Also, Ubuntu uses either admin (<10.x) or adm (>12.x) as the admin group!

  1. Increase SSH security by reducing the grace time, not allowing root to log in (Ubuntu has no usable root login by default, but in case you are compromised and a root account is added), and only allowing the groups you want to log in to the box.  I run a shell for friends, so in order to allow them to log in, I create a “ssh” group and put them into that group.
    Open /etc/ssh/sshd_config
    LoginGraceTime 20
    PermitRootLogin no
    AllowGroups adm ssh
  2. Make the “su” program unavailable to non-admin users
    sudo chown root:adm /bin/su
    sudo chmod 4750 /bin/su
  3. Install more apparmor profiles, read up on apparmor and make sure to think about it when troubleshooting issues.  Sometimes when you don’t use default file paths, apparmor will not allow an application to read/write to locations not whitelisted.
    sudo apt-get install apparmor-profiles
  4. Install denyhosts; this will block bots trying to brute-force you.
    sudo apt-get install denyhosts
  5. Here is an example of my changes to denyhosts
    Edit /etc/denyhosts.conf (diff -U3 denyhosts.conf.orig denyhosts.conf)
    --- denyhosts.conf.orig 2009-07-21 09:54:25.000000000 -0500
    +++ denyhosts.conf      2009-07-21 10:00:59.000000000 -0500
    @@ -57,13 +57,15 @@
    #            'y' = years
    #
    # never purge:
    -PURGE_DENY =
    +#PURGE_DENY =
    #
    # purge entries older than 1 week
    #PURGE_DENY = 1w
    #
    # purge entries older than 5 days
    #PURGE_DENY = 5d
    +# purge entries older than 4 weeks
    +PURGE_DENY = 4w
    #######################################################################
    #######################################################################
    @@ -90,9 +92,9 @@
    # eg.   sshd: 127.0.0.1  # will block sshd logins from 127.0.0.1
    #
    # To block all services for the offending host:
    -#BLOCK_SERVICE = ALL
    +BLOCK_SERVICE = ALL
    # To block only sshd:
    -BLOCK_SERVICE  = sshd
    +#BLOCK_SERVICE  = sshd
    # To only record the offending host and nothing else (if using
    # an auxilary file to list the hosts).  Refer to:
    # http://denyhosts.sourceforge.net/faq.html#aux
    @@ -218,7 +220,7 @@
    # Multiple email addresses can be delimited by a comma, eg:
    # ADMIN_EMAIL = foo@bar.com, bar@foo.com, etc@foobar.com
    #
    -ADMIN_EMAIL = root@localhost
    +#ADMIN_EMAIL = root@localhost
    #
    #######################################################################
    @@ -285,7 +287,7 @@
    #
    #SYSLOG_REPORT=NO
    #
    -#SYSLOG_REPORT=YES
    +SYSLOG_REPORT=YES
    #
    ######################################################################
  6. To keep a host from ever being blocked by denyhosts, list its IPs in this file: /var/lib/denyhosts/allowed-hosts
  7. Make sure changes have been applied:
    sudo /etc/init.d/denyhosts restart
  8. Install performance monitor SAR
    sudo apt-get install sysstat
    Edit /etc/default/sysstat
    Set: ENABLE="true"
    sudo /etc/init.d/sysstat start
  9. Install logwatch and monitor the emails it sends you (root).  This will give you a good overview of your system if you don’t have a syslog server.
    sudo apt-get install logwatch
  10. Install Root Kit Hunter; this is a cron job that will check your system for root kits.  It keeps track of your binaries in case their MD5 changes.
    sudo apt-get install rkhunter
  11. Edit /etc/rkhunter.conf and add these changes to the very bottom.  They may not all apply to you, but they cover some false positives I needed to whitelist.
    MAIL-ON-WARNING=root@localhost
    ENABLE_TESTS="all"
    DISABLE_TESTS="suspscan hidden_procs deleted_files packet_cap_apps apps"
    ALLOWHIDDENDIR=/etc/.java
    ALLOWHIDDENDIR=/dev/.static
    ALLOWHIDDENDIR=/dev/.udev
    ALLOWHIDDENDIR=/dev/.initramfs
    ALLOWHIDDENFILE=/dev/.blkid.tab
    ALLOWHIDDENFILE=/dev/.blkid.tab.old
    SCRIPTWHITELIST=/usr/local/bin/lwp-request
  12. After installing rkhunter, you will get emails indicating if there is anything odd happening on your box.  Most of the time it’s from updates, so if you run apt-get upgrade or apt-get dist-upgrade, you need to run this command to update rkhunter:
    sudo rkhunter --propupd
  13. Secure shared memory: edit /etc/fstab and add:
    tmpfs           /dev/shm        tmpfs   defaults,noexec,nosuid  0  0
  14. sudo mount -o remount /dev/shm
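
After the remount in step 14 it is worth confirming the options actually took.  A minimal sketch, assuming a Linux box where /proc/mounts lists /dev/shm; the function name is my own:

```shell
#!/bin/sh
# Sketch: succeed only if /dev/shm is mounted with both noexec and nosuid.
# Reads /proc/mounts by default; another file can be passed for testing.
shm_hardened() {
    mounts=${1:-/proc/mounts}
    # Field 2 is the mount point, field 4 the comma-separated options.
    opts=$(awk '$2 == "/dev/shm" { print $4 }' "$mounts" | tail -n 1)
    case ",$opts," in *,noexec,*) ;; *) return 1 ;; esac
    case ",$opts," in *,nosuid,*) ;; *) return 1 ;; esac
    return 0
}

# Usage:
#   shm_hardened && echo "/dev/shm is noexec,nosuid" || echo "/dev/shm NOT hardened"
```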

Citrix XenServer How to Force Shutdown Virtual Machines

Sometimes you just can’t get a VM to shut down; it may be an issue with XenTools or sun spots. Here is a list of commands that will help you get that damn thing shut down.

  1. Disable High Availability (HA) so you don’t run into issues
  2. Log into the Xenserver host that is running your VM with issues via ssh or console via XenCenter
  3. Run the following command to list VMs and their UUIDs
    xe vm-list
  4. First you can try just the normal shutdown command with force
    xe vm-shutdown uuid=<UUID from step 3> force=true
  5. If that just hangs, use CONTROL+C to kill it off and try to reset the power state.  The force is required on this command
    xe vm-reset-powerstate  uuid=<UUID from step 3> force=true
  6. If the VM is still not shut down, we may need to destroy the domain
  7. Run this command to get the domain ID of the VM.  It is the number in the first column of the row for your VM’s UUID
    list_domains
  8. Now run this command using the domain ID from the output of step 7
    Before XenServer 7.x:
    /opt/xensource/debug/xenops destroy_domain -domid <DOMID from step 7>
    XenServer 7.x and greater:
    xl destroy <DOMID from step 7>
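
Steps 4 and 5 can be chained so the power state reset only runs when the forced shutdown fails.  This is a rough sketch with a made-up function name, assuming the xe CLI is available.  It does not handle the case where vm-shutdown simply hangs (you still need CONTROL+C for that), and destroying the domain is left as a manual last resort.

```shell
#!/bin/sh
# Sketch: try a forced shutdown, then fall back to resetting the power state.
force_stop_vm() {
    uuid=$1
    xe vm-shutdown uuid="$uuid" force=true && return 0
    echo "forced shutdown failed, resetting power state" >&2
    xe vm-reset-powerstate uuid="$uuid" force=true
}

# Usage:
#   force_stop_vm <UUID from step 3>
```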