Tag Archives: VCDX
Before I start: it's been a while since my last post, mainly because I have been really busy with work and family. Hopefully I will now make it a habit to post something useful once every few weeks.
Disclaimer: This is not the **official** recommendation from Nutanix on Cisco ACI. This is just something that I worked on for a client of ours and thought it would be useful for anyone who might end up deploying Nutanix + Cisco ACI + a new vCenter on Nutanix NDFS. :)
Problem Statement: Cisco ACI requires out-of-band (OOB) access to vCenter in order to deploy the Cisco ACI networks as port groups in vCenter. The vCenter is to be built on NDFS. NDFS (ideally) needs the 10 Gb fabric from each node, and all the uplinks in the leaf switch are controlled by Cisco ACI, but ACI needs vCenter up before it can push out the Management and VM Network VLANs.
Some of you might see this and go "uh oh", but let me assure you, this can also become a problem in non-Nutanix environments, especially for anyone using IP-based storage with only 2 x 10 Gb adapters.
There are 2 ways in which we can take care of this.
Option 1: Deploy vCenter in a management-only cluster, which doesn’t depend on Cisco ACI for networking. (Needs separate physical infrastructure for networking and for the management cluster.)
Option 2: Add another dual-port 10 Gb NIC to each of the nodes. (Becomes a lot more expensive when you think of tens of nodes x 4 x 10 Gb adapters.)
Both the above options are quite costly, be it from a networking physical infrastructure point of view or a management only cluster point of view.
So how do we go about solving this?
I have been seriously thinking about and prepping for #VCDX-Cloud. Thinking about CMA couldn’t be more different from when I was starting my prep for VCDX-DCV.
Having said that, I came across an interesting discussion on Twitter yesterday where a few guys I know in person and a few I know only on Twitter (you all know who you are) were discussing which VCDX stream one should be focusing on right now.
Disclaimer: These are my personal thoughts and not representative of any of the vendors mentioned in the blog below. If I have misquoted a technological aspect or misrepresented any vendor, please let me know and I will either remove or modify it. All information provided here is as per my understanding and is not a statement of fact from any vendor; for accurate support statements, please consult your respective vendor.
I was told multiple times, before I submitted my VCDX design and also during VCDX boot camps, that using pre-built Converged Infrastructure or Hyper-Converged Infrastructure might be harder to defend. In my mind this becomes a constraint, but definitely not a show-stopper.
To truly architect a solution, one has to know how to architect it not just with one technology, but be able to swap the technology or product on the go and still achieve the same outcome. This is especially true for a VCDX design submission, as you will be asked for alternatives to the technology/hardware stack that you have used. If you don’t know what the alternatives for your solution are, you had better read up.
I defended my VCDX design that was based on VCE vBlock, so I speak with experience when it comes to this point. It was hard, but I am not sure it would have been different if it were any other hardware provider. At the end of the day it was only one of the constraints, my solution design met all the requirements of the client and that’s what matters the most. I am sure the same requirements would’ve been met if it was HP, IBM, Hitachi, NetApp or any other solution out there.
This post is not meant to deter people who are working on their VCDX designs based on vBlock or Nutanix or FlexPod or Simplivity or any other vendor. If you don’t know the reasons why your company chose a particular vendor, either speak to the decision maker or read up on proposals put together by the vendor.
You also need to know the pros and cons and possible substitutes for each solution, no matter what platform it is on. Every platform has its own challenges, from the business aspect to the technological aspect. One of the clients I worked with previously was so against HP and EMC that they never even entertained RFQs from them. The same can be said about all the other vendors.
Let’s get back to the point. 🙂
Here are a few points which will help you in defending your design based on CI or HCI:
- Know the pros and cons of each aspect of the solution. Even if you haven’t had the final word on any aspect, know why a particular decision has been made. If you don’t agree with it, raise it and get more clarification.
- Know the alternatives provided by other vendors for the same solution. This will definitely help you broaden your decision making abilities.
- Know the limits of what the CI or HCI can and can’t do (what is supported and what is not).
- A Solution Design should be repeatable and reusable infinitely (within reason). So get the scale up or scale out or up and out decision firm.
- Make changes, the whole premise about having a CI or HCI is a template to start off with. So make the necessary changes (within reason) wherever required.
- Know the operational aspects of the technology you are designing; an architect’s job is not finished after design. If it can’t be implemented successfully, it’s a failed design. (Similar to when a house crumbles down: it’s not just the builder who messed up, it’s the architect too.)
- Test all the facets of the solution and document where the outcomes were not as expected. Re-engineer and re-test until you get the desired outcome.
I think it’s true that defending a ‘real world’ design based on Converged Infrastructure might be harder. Here are the reasons why:
- There are a lot of moving parts in new-age infrastructure. You will have to design all the components individually and together, so each aspect has to adhere to the availability requirements on its own and in concert with the other components of the design.
- You are limited to the hardware options provided by that (Hyper-)Converged Infrastructure provider. You are also limited by each vendor’s self-imposed limits. For example, VMware may say that you can have 10,000 VMs powered on concurrently, but the vendor, having tested the theoretical maximum, might cut it down to, say, 8,000 VMs per vCenter.
- You can’t really swap, say, a VNX 5600 for a VMAX in a lower-model vBlock, or go from a 2 SSD + 4 HDD node to an all-SSD configuration on the same node with Nutanix (the all-SSD option was announced more recently, so it ‘may’ now be possible).
- Scaling out or up or up and out is easy with both CI and HCI but becomes very expensive very fast if there is no control. Adding more hardware is never the solution for application related problems.
- If you didn’t do your capacity planning properly, going back to the project board for more money after the procurement is done is usually not something a project manager wants to do, regardless of what technology you use.
- In addition to this, multiple facets of the design have already been decided, like the recoverability aspect, for example. Nutanix (I think) recommend using Veeam, whereas VCE use the EMC RecoverPoint/Avamar/Data Domain products. You might not be aware of the operational processes of either if you have been using NetBackup or TSM. So one should know which facets of the design are being influenced by using these products.
- When something like a vBlock, FlexPod, Nutanix or SimpliVity has already been purchased, the business requirements that were given to the sales/pre-sales team don’t necessarily make sense from a technology point of view for a solution design. So the architect always has to go back to the sales/pre-sales person for confirmation on requirements, as opposed to going directly to the customer. This can sometimes be a good thing, but mostly it is not.
The advantage of using the prebuilt (hyper) converged infrastructure is that you architect the complete solution before the infrastructure is on the DC floor and everything comes in pre-built, pre-configured and pre-tested. So you are good to go in a few hours and start putting your workloads on the shiny new toy.
There are a lot of things one needs to consider when architecting a virtualisation or cloud platform. To truly architect a solution, you should be able to achieve not just what’s in the VCDX Blueprint (although the blueprint gets you 70% there) but also ensure that the customer understands the nuances of using the technology day in, day out. If the support staff are not trained in the particular technology, they will have to be trained properly. User experience is paramount when it comes to virtualisation or cloud solutions; if it’s too hard, you are doing it wrong.
I know this is going to spark some very heated debates about platforms and which one has the better technology, but this is not a post about who’s better; these are my observations on defending a VCDX design when you are not the absolute decision maker on every single facet of the solution.
I read this blog from Chris Colotti about the VCDX fear of failure, so I thought I’d share my thoughts on that issue. As someone who failed the VCDX on the first attempt (spoiler alert), here is what I think.
Failure is not the end
Fear of failure is part of what makes humans what they are. It builds character. It shows the kind of person you are. Ask any VCDX and they will tell you it’s about the path, not the end result. When I failed my first defence I was angry, even ashamed. But that drove me to strive harder, study harder, practise harder.
I believe “Ever tried. Ever failed. No matter. Try again. Fail again. Fail better.” is the best motto for anything in life. The only profession where failure is not an option is medicine, but even doctors fail sometimes. (Both my parents are doctors, so I know how hard they strive to save every single life.)
I know a few people who can’t accept failure in getting through to the VCDX defence or in passing the defence. They think it’s a personal insult that they haven’t succeeded in getting the VCDX. It’s not an insult; if anything, it just means you might not have had a great day at the office. Everyone has an off day, or two, or three.
For me it’s still only a certification, though being a VCDX is definitely something to be proud of. But aspiring to be a VCDX, or for that matter anything that validates you and your knowledge, is worth the fear of failure.
Signing off with this..
“Many of life’s failures are people who did not realize how close they were to success when they gave up.” — Thomas A. Edison
XtremIO is the newest and fastest (well, EMC say so) All-Flash Array in the market. I have been running this in my “lab” for a POC which is quickly turning into a major VDI refresh for one of our clients. Having run through the basics of creating storage, monitoring, alerting etc. in my previous posts, I am going to concentrate on what parameters we need to change in the vSphere world to ensure we get the best performance from XtremIO.
The parameters also depend on what version of ESXi you’re using, as XtremIO supports ESXi 4.1+.
Without further delay, let’s start.
Adjusting the HBA Queue Depth
We are going to be sending a lot more IO through to the XtremIO array than we would to a traditional hybrid array, so we need to ensure that the HBA queue depth allows a lot more IO requests through.
You can find out the module in use with the command:
Step 1: esxcli system module list | grep ql (or lpfc for Emulex)
Once you find out the module that is being used. The command below can be used to change the HBA queue depth on the server.
QLogic – esxcli system module parameters set -p ql2xmaxqdepth=256 -m qla2xxx (or whatever module the command in Step 1 returned)
Emulex – esxcli system module parameters set -p lpfc0_lun_queue_depth=256 -m lpfc820 (or whatever module the command in Step 1 returned)
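As a sketch of how Step 1 and the queue-depth change chain together, the snippet below pulls the module name out of the listing and builds the QLogic command. The sample output and module names are illustrative only, not taken from a live host; on a real ESXi host you would pipe the live `esxcli system module list` output instead and run the resulting command rather than print it.

```shell
# Illustrative stand-in for `esxcli system module list` output on a host
# with a QLogic HBA; replace with the live command on a real ESXi host.
sample='qla2xxx                true        true
lpfc820                false       false'

# Grab the first module whose name starts with "ql" (column 1).
hba_module=$(printf '%s\n' "$sample" | awk '$1 ~ /^ql/ {print $1; exit}')

# Build the queue-depth command for that module (printed, not executed here).
echo "esxcli system module parameters set -p ql2xmaxqdepth=256 -m $hba_module"
```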
If you are not going to use PowerPath: since XtremIO is an active-active array with X controllers (yes, I know it has 2 controllers per X-Brick, and as of today you can scale up to 6 X-Bricks per cluster, so 12 controllers), we will be using Round Robin with NMP.
The engineers who work with XtremIO recommend that the default number of IOs per path be changed from 1000 to 1, yes, “ONE”, so that you are essentially spreading IO requests across every controller in the cluster. I haven’t really seen any improvement in performance by doing so, but it is only a recommendation at the end of the day. If you find that you won’t achieve any significant performance gain by doing so, the onus is on you to make that decision.
First, let’s get all the volumes that have been configured on XtremIO:
esxcli storage nmp path list | grep XtremIO
This will give you the naa IDs of all the volumes that are running on XtremIO.
Now let’s set the policy to RR for those volumes.
esxcli storage nmp device set --device <naa.id> --psp VMW_PSP_RR (5.x)
esxcli nmp device setpolicy --device <naa.id> --psp VMW_PSP_RR (4.1)
You can also set the default path selection policy for any storage in 5.x by identifying the SATP and modifying it with the command:
esxcli storage nmp satp set --default-psp=VMW_PSP_RR --satp=<your_SATP_name>
To set the number of IOs per path to 1 in RR:
esxcli storage nmp psp roundrobin deviceconfig set -d <naa.id> --iops 1 --type iops (5.x)
esxcli nmp roundrobin setconfig --device <naa.id> --iops 1 --type iops (4.1)
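Putting the above together, a small script can apply Round Robin and the iops=1 setting to every XtremIO volume in one pass. This is only a sketch for ESXi 5.x: the path listing and naa IDs below are made up for illustration, and the per-device commands are printed rather than executed, so on a live host you would feed in the real `esxcli storage nmp path list` output and run the commands directly.

```shell
#!/bin/sh
# Illustrative stand-in for `esxcli storage nmp path list | grep XtremIO`
# output (a device appears once per path); use the live command on a host.
sample_paths='Device: naa.514f0c5e1a600001
Device: naa.514f0c5e1a600002
Device: naa.514f0c5e1a600001'

# De-duplicate the naa IDs so each volume is configured exactly once.
devices=$(printf '%s\n' "$sample_paths" | awk '/Device:/ {print $2}' | sort -u)

for dev in $devices; do
    # Printed here for illustration; run these directly on the host.
    echo "esxcli storage nmp device set --device $dev --psp VMW_PSP_RR"
    echo "esxcli storage nmp psp roundrobin deviceconfig set -d $dev --iops 1 --type iops"
done
```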
Of course, if you don’t want to change all of this, you can still use PowerPath.
Host Parameters to Change
For best performance we also need to set a couple of disk parameters. You can do this via GUI or the easier way via CLI (preferred).
Using the GUI, set the following advanced parameters: Disk.SchedNumReqOutstanding to 256 and Disk.SchedQuantum to 64.
Note: If you have non-XtremIO volumes on these hosts, these settings may over-stress those arrays’ controllers and cause performance degradation while communicating with them.
Using the command line in 4.1, set the parameters with:
esxcfg-advcfg -s 64 /Disk/SchedQuantum
esxcfg-advcfg -s 256 /Disk/SchedNumReqOutstanding
To verify that they have been set correctly, use:
esxcfg-advcfg -g /Disk/SchedQuantum
esxcfg-advcfg -g /Disk/SchedNumReqOutstanding
You should also change Disk.DiskMaxIOSize from the default of 32767 down to 4096. This is because XtremIO reads and writes in 4 KB chunks by default, which is how it gets its awesome deduplication ratio.
In ESXi 5.0/5.1, Disk.SchedNumReqOutstanding is a host-wide setting, configured as shown above. In vSphere 5.5 it became a per-volume parameter, so instead of configuring it per host you set it on each device individually:
esxcli storage core device set -d <naa.id> -O 256
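As a sketch, the 5.5 per-device setting can be looped over the same list of XtremIO naa IDs gathered earlier. The IDs below are placeholders, and the commands are printed instead of executed; on a live host, substitute the real device list and run them directly.

```shell
#!/bin/sh
# Placeholder naa IDs; on a live 5.5 host, substitute the real XtremIO list.
for dev in naa.514f0c5e1a600001 naa.514f0c5e1a600002; do
    # -O sets the per-device limit that replaced Disk.SchedNumReqOutstanding.
    echo "esxcli storage core device set -d $dev -O 256"
done
```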
vCenter Server Parameters
Depending on the number of xBricks configured per cluster, the vCenter Server parameter config.vpxd.ResourceManager.maxCostPerHost needs to be changed. This adjusts the maximum number of concurrent full clone operations:
One xBrick Cluster – 8 (default)
Two xBrick Cluster – 16
Four xBrick Cluster – 32
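The parameter goes into the vCenter Server advanced settings as a single key/value pair; for example, for a two-xBrick cluster the entry would look like the sketch below (value taken from the table above, with 8 or 32 substituted for one- or four-xBrick clusters):

```
config.vpxd.ResourceManager.maxCostPerHost = 16
```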
That’s the end of this post. Please feel free to correct me if I’ve got any commands wrong.
Recommendation (as per AusFestivus’ comment): EMC recommend that PowerPath be used for best performance, but it always comes down to cost constraints and how much the client wants to spend. In my opinion, PowerPath is more of a “nice to have for best performance without tinkering”. If you can keep tinkering and changing things to get the best performance out, you can do without it.
If you missed Part 1, here it is.
So I was in SFO for the VCDX defence. My wife and kid joined me on the trip; she has family in Cali, so she went to see them while I stayed back. I was nervous and didn’t know if I would make it. After sleeping through most of the day, I decided to catch up with people whom I’d known only on Twitter and via WebEx. It was late, so I caught up with Brad, Kalen, Garette and Mark, had a couple of beers and went back.
Nutanix (@nutanix) had offered to provide us with a meeting room where we could do mock sessions. We did some mocks, and quite a few VCDXs dropped in and gave us a lot of tips. We attended the VCDX bootcamp. The next day we did the same, the only change being that we did design scenarios and TS scenarios. We had fun; Kalen Ardnt gave us TS scenarios from his experience at VMware Support. A word of caution: he is mad, absolutely mad. So if you see him in a TS scenario you can rest assured that he is going to fry your brain and have it with some sauce (along with a few laughs, for sure). That was the Sunday. Monday came and went while I was holed up in a hotel room studying.
Tuesday morning and I was a nervous wreck, couldn’t breathe or stay put. I was pacing up and down the hotel room. It was time to leave and I walked to the Hilton where the defence was scheduled.
Mark Brunstad and John Arrasjid met with me and took me to the holding area. There were 2 panels happening at the same time. Also in my defence room were an observer and 2 new panelists in training. John introduced me to everyone in the room. And it began. I was nervous, my hands were sweating and I was fidgety at best for the first 5 mins. Then I began to enjoy myself in the defence. It went reasonably OK, though I was still disappointed with my effort. I enjoyed the defence and TS scenario a lot more than the design scenario.
I couldn’t answer some questions that the panelists asked me and skipped over a couple with vague answers. It was done. I came out, met Magnus Edh (@vTerahertz, #140) and we exchanged how it went. I wasn’t happy with my design scenario and thought I could’ve done better in the design defence. That evening, I was smuggled into the VCDX/vExpert party by Brad and had the chance to rub shoulders with the giants of the industry present there. I also saw VCDX-Alpha being presented to CEO Pat Gelsinger for his vision of and contribution to the program.
Thursday evening, I went to visit my wife’s family and got a text from Brad asking if I had made it. All along I had thought I would get the result when I was back in Australia. I opened the email and it wasn’t good news. I was furious with myself and couldn’t enjoy the family time. I still had to do Vegas, LA and Disneyland. I wished there were a hole for me to curl up into and never come out of. The next day, I thought ‘you know what, I am going to try again. It’s not the end of the road’.
VCDX Attempt #2
I enjoyed the rest of the trip to the US; I didn’t really like Vegas, but loved Disneyland and spending time with family. I came back, pulled up my socks and started working on my design again. I submitted the same design, incorporating the feedback given by the panel, and got accepted to defend it in Singapore on July 7.
The sleepless nights began again. I did mocks with Josh Odgers and Grant Orchard the week before the defence and gave it a final push. Then, on Saturday 5th July, I injured my big toe in a stupid accident and had to get half the nail taken off (yep, it happened).
Flying out Sunday morning, I was completely under the influence of painkillers and could barely walk. That 9-hour flight was the longest and most painful flight of my life. After landing, I caught up with Andrew Brydon (@sidbrydon, #139) and had McDonald’s for dinner.
I woke up in immense pain and took the painkillers again. It was time to get ready. I tried to put shoes on, and it was impossible to take a step or stand with them on, so I decided to go in sandals. It took me 25 minutes to walk to the defence location, which was less than a 5-minute walk away. It wasn’t looking good.
I defended the design well and was OK in the TS scenario, but in my opinion didn’t do as well in the design scenario. I was limping the whole time, walking up to the flip chart to draw something and back to the middle. It was so painful that after my design defence I took 4 more painkillers to numb it, and I couldn’t think straight for the first 10 minutes of the design scenario. I felt stupid for taking those painkillers; I cursed myself in my head, unable to bring to my mouth the words my brain was processing. It was hell. I thought, ‘If I don’t make it this time I am done. I am so done’.
I walked back, changed the blood-soaked dressing on my toe, had a drink with Andrew and distinctly remember saying “I f****d up the design scenario as I couldn’t think for the first 10 mins”. We did a quick design scenario mock and I went back to Changi to catch my flight home. The flight was delayed by 3 hours, so more painkillers because of all the walking.
Two days later, I woke up and glanced at my phone blinking green. I saw Andrew Daunce’s tweet congratulating me. I checked my email on the phone… NOTHING. Checked it on my Mac… NOTHING. I felt as if there were a knife in my chest. I sent a quick message off to Mark Brunstad asking for an update. Then someone posted the VCDX directory snippet with my name in it. I’d done it. Yes, it is an achievement.
For everyone and anyone who thinks failure is a step back: no, for the VCDX it’s not. You’ll learn from it and not make the same mistakes again.
“Ever tried. Ever failed. No matter. Try Again. Fail again. Fail better.”
A special thanks to all the people below in no particular order
My wife (@harneet_brar) and my 4 year old son (Pillars of my life), Josh Odgers (@josh_odgers), Grant Orchard (@grantorchard), Arron Stebbing (@arronstebbing), Andrew Brydon (@sidbrydon), Derek Seaman (@vDerekS), Brad Christian (@bchristian21), Josh Coen (@joshcoen), Jeff Mercier, Sean Howard, Richard Arsenian (@richardarsenian), Kalen Ardnt (@kalenardnt), James Charter (@Davesrant), Tim Antonowicz (@timantz), Mark Gabryjelski, Michael Webster (@vcdxnz001), Manish (@mandivs), Mark Brunstad (@markbrunstad), John Arrasjid (@vcdx001), Duncan Epping (@duncanyb), Frank Denneman (@frankdenneman), Andrew Mitchell (@amitchell01), Rene (@vcdx133) and everyone in the VMware community.
This is your victory too 🙂
After a long and arduous journey over a few months starting November 2013, I finally achieved what I had wanted to do since 2009. I am VCDX-DCV #135.
This blog post is to help those pushing through the initial hesitation and also those who’ve stumbled on the first go. It’ll be a bit long, so bear with me. Here we go…
I got my VCP4 in 2008 when I was working as a Senior Wintel Engineer. I had worked on the initial ESX 3.0 and 3.5 releases and was really awestruck by how simple it was to set up a basic cluster and by the advantages it gave the workloads. I still remember my first vMotion experience, and it was awesome. I wanted to know more and do more with this awesome technology, so began my journey to do the VCAPs. I got my VCAP DCD4 first and then followed it up with VCAP DCA4 a year later. In early 2009 the thought of doing my VCDX came to my mind; I’d just attended my first VMUG, and the session was with the first VCDX in Australia, Andrew Mitchell (#30).
A lot of time passed between 2009 and 2013 when I finally got enough courage to put my VCDX Application through in December 2013. I had requested some feedback from Will Huber (@huberw, #81), Josh Odgers (@josh_odgers, #90) and Manish (@mandivs) with my documentation. I got positive feedback from them so I felt OK. Then got the reply from Mark Brunstad (VCDX Program Director) in Jan 2014 that my application was successful for defence at Partner Exchange, SFO on Feb 11. I was so happy that I was one step closer to getting VCDX.
I immediately logged onto Twitter to see who else had been accepted. I found quite a few people who’d tweeted that they were doing the same: Derek Seaman, Kalen Ardnt, Mark Snook, Garette, Brad Christian, Jeff Mercier, Josh Coen, Sean Howard & Safouh, also known as the #VCDXPEX14CREW.
Brad Christian (@bchristian21) was kind enough to run a group mock defence each night for 2 weeks. I attended a few and missed a few (including mine, thanks to my 4-year-old), but overall learnt a lot.
I also contacted Josh Odgers, who in my opinion is one of the best guys to know and work with, and asked him to help me out with a couple of mock panels so as to know where I stood. Little did I know that there was one more candidate, whom Josh was helping out, Richard Arsenian (VCDX #126). Richard and I did one mock panel each and I was terrible. I convinced Josh and Richard for one more go the next day. I redid the whole presentation, still couldn’t answer a few questions properly. The day was here to fly to SFO..
It’s long enough… let’s take a break.
To be continued..